Don’t hesitate to ask the course convenor for help via OpenLearning. The course instructor are happy to point you in the right direction and to make suggestions, but they won’t, of course, complete your assessment for you!
The data used for this assessment consist of records from Intensive Care Unit (ICU) hospital stays in the USA. All patients were adults who were admitted for a wide variety of reasons. ICU stays of less than 48 hours have been excluded.
The source data for the assessment are data made freely available for the 2012 MIT PhysioNet/Computing for Cardiology Challenge. Details are provided here. Training Set A data have been used. The original data has been modified and assembled to suit the purpose of this assessment. While not required for the purposes of this assignment, full details of the preparatory work can be found in the ‘hdat9600assess4_final_data_prep’ file.
The dataframe consists of 120 variables, which are defined as follows:
Use the hyperlinks below to find out more about the clinical meaning of each variable. The first two clinical variables are summary scores that are used to assess patient condition and risk.
The following 36 clinical measures were assessed at multiple timepoints during each patient’s ICU stay. For each of the 36 clinical measures, you are given 3 summary variables: a) The minimum value during the first 24 hours in ICU (_min), b) The maximum value during the first 24 hours in ICU (_max), and c) The difference between the mean and the most extreme values during the first 24 hours in ICU (_diff). For example, for the clinical measure Cholesterol, these three variables are labelled ‘Cholesterol_min’, ‘Cholesterol_max’, and ‘Cholesterol_diff’.
The data frame can be loaded with the following code:
icu_patients_df0 <- readRDS("icu_patients_df0.rds")
icu_patients_df1 <- readRDS("icu_patients_df1.rds")
Note: icu_patients_df1 is an imputed (i.e. missing values are ‘derived’) version of icu_patients_df0. This assessment does not concern the methods used for imputation.
In this task, you are required to develop a logistic regression model using the icu_patients_df1 data set which adequately explains or predicts the in_hospital_death variable as the outcome using a subset of the available predictor variables. You should fit a series of models, evaluating each one, before you present your final model. Your final model should not include all the predictor variables, just a small subset of them, which you have selected based on statistical significance and/or background knowledge. It is perfectly acceptable to include predictor variables in your final model which are not statistically significant, as long as you justify their inclusion on medical or physiological grounds (you will not be marked down if your medical justification is not exactly correct or complete, but do you best). Aim for between five and ten predictor variables (slightly more or fewer is OK). You should assess each model you consider for goodness of fit and other relevant statistics to help you choose between them. For your final model, present a set of diagnostic statistics and/or charts and comment on them. You don’t need to do an exhaustive exploratory data analysis of all the variables in the data set, but you should examine those variables that you use in your model. Finally, re-fit your final model to the unimputed data frame (icu_patients_df0.rds) and comment on any differences you find compared to the same model fitted to the imputed data.
Select an initial subset of explanatory variables that you will use to predict the risk of in-hospital death. Justify your choice.
Conduct basic exploratory data analysis on your variables of choice.
Fit appropriate univariate logistic regression models.
Fit an appropriate series of multivariable logistic regression models, justifying your approach. Assess each model you consider for goodness of fit and other relevant statistics.
Present your final model. Your final model should not include all the predictor variables, just a small subset of them, which you have selected based on statistical significance and/or background knowledge.
For your final model, present a set of diagnostic statistics and/or charts and comment on them.
Write a paragraph or two summarizing the most important findings of your final model. Include the most important values from the statistical output, and a simple clinical interpretation.
# Getting the summary statistics of the dataset
summary(icu_patients_df1)
## RecordID Length_of_stay SAPS1 SOFA
## Min. :132539 Min. : -1.00 Min. : 1.00 Min. :-1.000
## 1st Qu.:133875 1st Qu.: 6.00 1st Qu.:11.00 1st Qu.: 3.000
## Median :135146 Median : 10.00 Median :15.00 Median : 6.000
## Mean :135156 Mean : 13.74 Mean :14.96 Mean : 6.441
## 3rd Qu.:136477 3rd Qu.: 17.00 3rd Qu.:19.00 3rd Qu.: 9.000
## Max. :137740 Max. :154.00 Max. :34.00 Max. :22.000
## NA's :96
## Survival in_hospital_death Days Status
## Min. : 0.0 Min. :0.0000 Min. : 0 Mode :logical
## 1st Qu.: 10.0 1st Qu.:0.0000 1st Qu.: 265 FALSE:1288
## Median : 68.0 Median :0.0000 Median :2408 TRUE :773
## Mean : 343.1 Mean :0.1441 Mean :1634
## 3rd Qu.: 420.0 3rd Qu.:0.0000 3rd Qu.:2408
## Max. :2408.0 Max. :1.0000 Max. :2408
## NA's :1288
## Age Albumin_diff Albumin_max Albumin_min
## Min. :16.00 Min. :0.01866 Min. :1.100 Min. :1.100
## 1st Qu.:52.00 1st Qu.:0.28134 1st Qu.:2.600 1st Qu.:2.600
## Median :67.00 Median :0.48134 Median :3.000 Median :3.000
## Mean :64.41 Mean :0.56829 Mean :3.045 Mean :3.012
## 3rd Qu.:78.00 3rd Qu.:0.81866 3rd Qu.:3.500 3rd Qu.:3.500
## Max. :90.00 Max. :2.31866 Max. :5.300 Max. :5.300
##
## ALP_diff ALP_max ALP_min ALT_diff
## Min. : 0.148 Min. : 19.0 Min. : 19.0 Min. : 0.446
## 1st Qu.: 21.852 1st Qu.: 57.0 1st Qu.: 58.0 1st Qu.: 89.446
## Median : 37.852 Median : 78.0 Median : 76.0 Median : 102.446
## Mean : 56.259 Mean : 105.7 Mean : 101.4 Mean : 154.873
## 3rd Qu.: 54.852 3rd Qu.: 110.0 3rd Qu.: 105.0 3rd Qu.: 108.446
## Max. :1408.148 Max. :1504.0 Max. :1339.0 Max. :10319.554
##
## ALT_max ALT_min AST_diff AST_max
## Min. : 3.0 Min. : 1.0 Min. : 0.647 Min. : 5.0
## 1st Qu.: 17.0 1st Qu.: 17.0 1st Qu.: 123.353 1st Qu.: 27.0
## Median : 30.0 Median : 30.0 Median : 142.353 Median : 51.0
## Mean : 118.3 Mean : 90.1 Mean : 227.991 Mean : 188.1
## 3rd Qu.: 69.0 3rd Qu.: 69.0 3rd Qu.: 152.353 3rd Qu.: 130.0
## Max. :10440.0 Max. :9240.0 Max. :15870.647 Max. :16040.0
##
## AST_min Bilirubin_diff Bilirubin_max Bilirubin_min
## Min. : 5.0 Min. : 0.03596 Min. : 0.100 Min. : 0.100
## 1st Qu.: 24.0 1st Qu.: 1.06404 1st Qu.: 0.400 1st Qu.: 0.400
## Median : 42.0 Median : 1.36404 Median : 0.700 Median : 0.600
## Mean : 116.4 Mean : 1.97637 Mean : 1.739 Mean : 1.568
## 3rd Qu.: 87.0 3rd Qu.: 1.46404 3rd Qu.: 1.300 3rd Qu.: 1.100
## Max. :7960.0 Max. :44.13596 Max. :45.900 Max. :45.500
##
## BUN_diff BUN_max BUN_min Cholesterol_diff
## Min. : 0.4729 Min. : 3.00 Min. : 2.00 Min. : 0.5772
## 1st Qu.: 7.4729 1st Qu.: 14.00 1st Qu.: 12.00 1st Qu.: 17.5772
## Median : 11.5270 Median : 20.00 Median : 18.00 Median : 34.4228
## Mean : 15.7904 Mean : 27.48 Mean : 24.44 Mean : 37.2723
## 3rd Qu.: 16.5270 3rd Qu.: 33.00 3rd Qu.: 29.00 3rd Qu.: 55.4228
## Max. :172.4729 Max. :197.00 Max. :157.00 Max. :173.5772
##
## Cholesterol_max Cholesterol_min Creatinine_diff Creatinine_max
## Min. : 59.0 Min. : 59 Min. : 0.03245 Min. : 0.200
## 1st Qu.:122.0 1st Qu.:121 1st Qu.: 0.33245 1st Qu.: 0.800
## Median :152.0 Median :152 Median : 0.53245 Median : 1.000
## Mean :153.4 Mean :153 Mean : 0.86298 Mean : 1.499
## 3rd Qu.:181.0 3rd Qu.:179 3rd Qu.: 0.73245 3rd Qu.: 1.500
## Max. :330.0 Max. :330 Max. :20.76755 Max. :22.000
##
## Creatinine_min DiasABP_diff DiasABP_max DiasABP_min
## Min. : 0.200 Min. : 0.5442 Min. : 22.00 Min. : 2.00
## 1st Qu.: 0.700 1st Qu.: 16.5442 1st Qu.: 68.00 1st Qu.: 40.00
## Median : 0.900 Median : 21.5442 Median : 77.00 Median : 46.00
## Mean : 1.319 Mean : 24.5299 Mean : 78.24 Mean : 46.56
## 3rd Qu.: 1.300 3rd Qu.: 28.4558 3rd Qu.: 86.00 3rd Qu.: 52.00
## Max. :14.100 Max. :209.4558 Max. :268.00 Max. :258.00
## NA's :715 NA's :715 NA's :715
## FiO2_diff FiO2_max FiO2_min GCS_diff
## Min. :0.00192 Min. :0.2800 Min. :0.2800 Min. :0.244
## 1st Qu.:0.15192 1st Qu.:0.5000 1st Qu.:0.4000 1st Qu.:3.756
## Median :0.44808 Median :1.0000 Median :0.4000 Median :3.756
## Mean :0.31376 Mean :0.7874 Mean :0.4863 Mean :5.183
## 3rd Qu.:0.44808 3rd Qu.:1.0000 3rd Qu.:0.5000 3rd Qu.:8.244
## Max. :0.44808 Max. :1.0000 Max. :1.0000 Max. :8.244
##
## GCS_max GCS_min Gender Glucose_diff
## Min. : 3.00 Min. : 3.000 Female: 913 Min. : 0.1445
## 1st Qu.:11.00 1st Qu.: 3.000 Male :1148 1st Qu.: 23.8555
## Median :15.00 Median : 8.000 Median : 39.1445
## Mean :12.87 Mean : 8.773 Mean : 57.0844
## 3rd Qu.:15.00 3rd Qu.:14.000 3rd Qu.: 61.8555
## Max. :15.00 Max. :15.000 Max. :1003.1445
##
## Glucose_max Glucose_min HCO3_diff HCO3_max
## Min. : 39.0 Min. : 24.0 Min. : 0.2275 Min. : 9.00
## 1st Qu.: 117.0 1st Qu.: 98.0 1st Qu.: 1.7725 1st Qu.:22.00
## Median : 141.0 Median :117.0 Median : 3.2275 Median :24.00
## Mean : 163.3 Mean :124.8 Mean : 4.1506 Mean :24.27
## 3rd Qu.: 180.0 3rd Qu.:141.0 3rd Qu.: 5.2275 3rd Qu.:27.00
## Max. :1143.0 Max. :632.0 Max. :24.2275 Max. :47.00
##
## HCO3_min HCT_diff HCT_max HCT_min
## Min. : 5.00 Min. : 0.06013 Min. :21.20 Min. : 9.00
## 1st Qu.:20.00 1st Qu.: 2.96013 1st Qu.:30.00 1st Qu.:26.20
## Median :23.00 Median : 5.16013 Median :33.10 Median :29.60
## Mean :22.43 Mean : 5.70366 Mean :33.57 Mean :30.08
## 3rd Qu.:25.00 3rd Qu.: 7.66013 3rd Qu.:36.70 3rd Qu.:33.70
## Max. :44.00 Max. :23.43987 Max. :54.40 Max. :50.60
##
## Height HR_diff HR_max HR_min
## Min. : 13.0 Min. : 0.9221 Min. : 44.0 Min. : 0.00
## 1st Qu.:162.6 1st Qu.: 20.0779 1st Qu.: 91.0 1st Qu.: 61.00
## Median :170.2 Median : 27.0779 Median :104.0 Median : 71.00
## Mean :170.0 Mean : 30.4294 Mean :106.6 Mean : 71.99
## 3rd Qu.:177.8 3rd Qu.: 36.9221 3rd Qu.:119.0 3rd Qu.: 81.00
## Max. :426.7 Max. :212.9221 Max. :300.0 Max. :126.00
## NA's :992
## ICUType K_diff K_max
## Coronary Care Unit :297 Min. : 0.03521 Min. : 2.500
## Cardiac Surgery Recovery Unit:448 1st Qu.: 0.33521 1st Qu.: 4.000
## Medical ICU :788 Median : 0.56479 Median : 4.300
## Surgical ICU :528 Mean : 0.69010 Mean : 4.419
## 3rd Qu.: 0.86479 3rd Qu.: 4.700
## Max. :18.76479 Max. :22.900
##
## K_min Lactate_diff Lactate_max Lactate_min
## Min. :1.80 Min. : 0.003596 Min. : 0.400 Min. : 0.300
## 1st Qu.:3.50 1st Qu.: 1.096404 1st Qu.: 1.500 1st Qu.: 1.200
## Median :3.90 Median : 1.503596 Median : 2.200 Median : 1.600
## Mean :3.95 Mean : 1.753380 Mean : 2.773 Mean : 1.899
## 3rd Qu.:4.30 3rd Qu.: 1.896404 3rd Qu.: 3.200 3rd Qu.: 2.200
## Max. :6.90 Max. :26.503596 Max. :29.300 Max. :24.200
##
## MAP_diff MAP_max MAP_min Mg_diff
## Min. : 0.2316 Min. : 4.0 Min. : 1.00 Min. :0.0157
## 1st Qu.: 21.7684 1st Qu.: 94.0 1st Qu.: 55.00 1st Qu.:0.1843
## Median : 29.2316 Median :104.0 Median : 61.00 Median :0.3157
## Mean : 38.4735 Mean :111.8 Mean : 62.76 Mean :0.4181
## 3rd Qu.: 41.2316 3rd Qu.:117.0 3rd Qu.: 70.00 3rd Qu.:0.5843
## Max. :213.2316 Max. :291.0 Max. :265.00 Max. :7.9157
##
## Mg_max Mg_min Na_diff Na_max
## Min. :1.100 Min. :0.600 Min. : 0.2066 Min. :112.0
## 1st Qu.:1.900 1st Qu.:1.600 1st Qu.: 1.7934 1st Qu.:137.0
## Median :2.100 Median :1.800 Median : 3.2066 Median :140.0
## Mean :2.153 Mean :1.857 Mean : 4.1146 Mean :139.8
## 3rd Qu.:2.400 3rd Qu.:2.100 3rd Qu.: 5.2066 3rd Qu.:142.0
## Max. :9.900 Max. :6.200 Max. :41.2066 Max. :177.0
##
## Na_min NIDiasABP_diff NIDiasABP_max NIDiasABP_min
## Min. : 98 Min. : 0.491 Min. : 29.00 Min. :10.00
## 1st Qu.:136 1st Qu.: 17.509 1st Qu.: 64.00 1st Qu.:33.00
## Median :138 Median : 25.500 Median : 76.00 Median :42.00
## Mean :138 Mean : 26.964 Mean : 76.92 Mean :43.17
## 3rd Qu.:141 3rd Qu.: 33.509 3rd Qu.: 89.00 3rd Qu.:52.00
## Max. :160 Max. :116.509 Max. :174.00 Max. :97.00
## NA's :455 NA's :455 NA's :455
## NIMAP_diff NIMAP_max NIMAP_min NISysABP_diff
## Min. : 0.0407 Min. : 47.33 Min. : 7.00 Min. : 0.3013
## 1st Qu.: 18.2893 1st Qu.: 81.08 1st Qu.: 52.33 1st Qu.: 25.6987
## Median : 24.7107 Median : 93.67 Median : 60.00 Median : 34.3013
## Mean : 26.9759 Mean : 94.47 Mean : 61.69 Mean : 37.7962
## 3rd Qu.: 33.2893 3rd Qu.:106.00 3rd Qu.: 70.00 3rd Qu.: 45.6987
## Max. :113.2893 Max. :189.00 Max. :121.00 Max. :157.3013
## NA's :455 NA's :455 NA's :455 NA's :453
## NISysABP_max NISysABP_min PaCO2_diff PaCO2_max
## Min. : 78.0 Min. : 4.00 Min. : 0.3358 Min. :16.00
## 1st Qu.:121.0 1st Qu.: 83.00 1st Qu.: 5.6642 1st Qu.:39.00
## Median :138.0 Median : 95.00 Median : 8.6642 Median :44.00
## Mean :140.5 Mean : 96.55 Mean :10.7463 Mean :45.56
## 3rd Qu.:156.0 3rd Qu.:108.00 3rd Qu.:13.3358 3rd Qu.:50.00
## Max. :274.0 Max. :234.00 Max. :57.6642 Max. :98.00
## NA's :453 NA's :453
## PaCO2_min PaO2_diff PaO2_max PaO2_min
## Min. : 0.30 Min. : 0.6179 Min. : 27.0 Min. : 20.0
## 1st Qu.:32.00 1st Qu.: 67.6179 1st Qu.:123.0 1st Qu.: 74.0
## Median :36.00 Median : 90.6179 Median :191.0 Median : 92.0
## Mean :36.72 Mean :119.5407 Mean :223.5 Mean :105.8
## 3rd Qu.:40.00 3rd Qu.:154.3821 3rd Qu.:311.0 3rd Qu.:122.0
## Max. :93.00 Max. :341.3821 Max. :500.0 Max. :477.0
##
## pH_diff pH_max pH_min Platelets_diff
## Min. :0.000114 Min. :7.150 Min. :3.000 Min. : 0.2307
## 1st Qu.:0.059886 1st Qu.:7.380 1st Qu.:7.280 1st Qu.: 39.7693
## Median :0.089886 Median :7.420 Median :7.340 Median : 72.7693
## Mean :0.098486 Mean :7.418 Mean :7.327 Mean : 92.5348
## 3rd Qu.:0.120114 3rd Qu.:7.460 3rd Qu.:7.390 3rd Qu.:116.7693
## Max. :4.369886 Max. :7.690 Max. :7.630 Max. :857.2307
##
## Platelets_max Platelets_min RespRate_diff RespRate_max
## Min. : 18.0 Min. : 9.0 Min. : 0.6514 Min. :13.00
## 1st Qu.: 157.0 1st Qu.:126.0 1st Qu.: 7.3486 1st Qu.:24.00
## Median : 210.0 Median :184.0 Median : 9.6514 Median :27.00
## Mean : 228.9 Mean :197.9 Mean :11.6075 Mean :29.12
## 3rd Qu.: 275.0 3rd Qu.:246.0 3rd Qu.:13.6514 3rd Qu.:33.00
## Max. :1047.0 Max. :891.0 Max. :78.6514 Max. :98.00
##
## RespRate_min SaO2_diff SaO2_max SaO2_min
## Min. : 4.00 Min. : 0.2461 Min. : 75.00 Min. : 33.00
## 1st Qu.:12.00 1st Qu.: 0.7539 1st Qu.: 97.00 1st Qu.: 95.00
## Median :14.00 Median : 1.7539 Median : 98.00 Median : 97.00
## Mean :14.25 Mean : 2.5635 Mean : 97.44 Mean : 95.85
## 3rd Qu.:17.00 3rd Qu.: 3.2461 3rd Qu.: 99.00 3rd Qu.: 98.00
## Max. :24.00 Max. :64.2461 Max. :100.00 Max. :100.00
##
## SysABP_diff SysABP_max SysABP_min Temp_diff
## Min. : 3.689 Min. : 52.0 Min. : 11.00 Min. : 0.1259
## 1st Qu.: 32.310 1st Qu.:135.0 1st Qu.: 79.00 1st Qu.: 0.8741
## Median : 40.690 Median :149.0 Median : 88.00 Median : 1.2741
## Mean : 45.008 Mean :152.1 Mean : 90.91 Mean : 1.3756
## 3rd Qu.: 53.690 3rd Qu.:167.0 3rd Qu.:102.00 3rd Qu.: 1.7259
## Max. :178.690 Max. :295.0 Max. :262.00 Max. :12.7741
## NA's :715 NA's :715 NA's :715
## Temp_max Temp_min TroponinI_diff TroponinI_max
## Min. :35.40 Min. :24.20 Min. : 0.1571 Min. : 0.30
## 1st Qu.:37.10 1st Qu.:35.60 1st Qu.: 4.6429 1st Qu.: 2.60
## Median :37.60 Median :36.10 Median : 5.2571 Median : 7.80
## Mean :37.69 Mean :36.01 Mean :10.1737 Mean :11.83
## 3rd Qu.:38.20 3rd Qu.:36.60 3rd Qu.:12.1571 3rd Qu.:17.60
## Max. :42.10 Max. :38.30 Max. :37.9571 Max. :43.40
##
## TroponinI_min TroponinT_diff TroponinT_max TroponinT_min
## Min. : 0.30 Min. : 0.0215 Min. : 0.0100 Min. : 0.0100
## 1st Qu.: 1.30 1st Qu.: 0.5785 1st Qu.: 0.0600 1st Qu.: 0.0400
## Median : 6.80 Median : 0.6285 Median : 0.1700 Median : 0.1200
## Mean :10.06 Mean : 1.0920 Mean : 0.9079 Mean : 0.6347
## 3rd Qu.:13.20 3rd Qu.: 0.6585 3rd Qu.: 0.8000 3rd Qu.: 0.4700
## Max. :42.90 Max. :23.7915 Max. :24.4600 Max. :22.9300
##
## Urine_diff Urine_max Urine_min WBC_diff
## Min. : 19.22 Min. : 0.0 Min. : 0.00 Min. : 0.03315
## 1st Qu.: 100.78 1st Qu.: 200.0 1st Qu.: 0.00 1st Qu.: 2.63315
## Median : 300.78 Median : 400.0 Median : 20.00 Median : 4.53315
## Mean : 438.25 Mean : 521.8 Mean : 34.55 Mean : 5.82079
## 3rd Qu.: 525.78 3rd Qu.: 625.0 3rd Qu.: 36.00 3rd Qu.: 7.23315
## Max. :4900.78 Max. :5000.0 Max. :600.00 Max. :143.46685
##
## WBC_max WBC_min Weight_diff Weight_max
## Min. : 0.10 Min. : 0.10 Min. : 0.00012 Min. : 34.60
## 1st Qu.: 9.30 1st Qu.: 7.60 1st Qu.: 7.60000 1st Qu.: 66.00
## Median : 12.30 Median : 10.40 Median : 14.70012 Median : 80.00
## Mean : 13.95 Mean : 11.51 Mean : 18.17040 Mean : 82.66
## 3rd Qu.: 16.90 3rd Qu.: 14.10 3rd Qu.: 24.80000 3rd Qu.: 94.55
## Max. :155.60 Max. :128.30 Max. :149.30012 Max. :230.00
## NA's :146 NA's :146
## Weight_min
## Min. : 34.60
## 1st Qu.: 65.00
## Median : 77.70
## Mean : 80.86
## 3rd Qu.: 91.95
## Max. :230.00
## NA's :146
We had a look at all the variables in the dataset before making a selection of 9 variables which we will investigate.
We selected the predictor variables (age, length of stay, ICU type, SOFA score, maximum fractional inspired oxygen (FiO2), minimum Glasgow Coma Score (GCS), maximum heart rate and minimum respiration rate) to predict the risk of in-hospital death.
An interesting finding with the summary statistics is that there might be issues with the data quality. As seen in the summary statistics of length of stay and SOFA scores, the minimum value is negative which is not possible as length of stay cannot be negative and SOFA scores range from 0-24.
Background Research supporting the selection of predictor variables
Based on the article “Relationship between age and in-hospital mortality during 15,345,025 non-surgical hospitalizations” from the Archives of Medical Science, the findings support in hospital death to be associated with the age of the patients. In this study, older patients have a greater odds of dying in hospital than younger patients.
Another article, “Patient length of stay and mortality prediction: A survey” from PubMed suggests that the longer length of stay in hospital is a parameter linked to the severity of the health conditions, therefore may be associated with in hospital mortality.
In this article, “Infection as an independent risk factor for mortality in the surgical intensive care unit” from National Library of Medicine which evaluated mortality in hospital from surgical and medical ICU, points out that certain types of ICU are associated with high in hospital mortality.
Additionally, “Prognostic Accuracy of the SOFA Score, SIRS Criteria, and qSOFA Score for In-Hospital Mortality Among Adults With Suspected Infection Admitted to the Intensive Care Unit” from PubMed propose that high SOFA scores are associated with higher in hospital mortality.
According to “Severity of respiratory failure at admission and in-hospital mortality in patients with COVID-19: a prospective observational multicentre study” from BMJ reports high FiO2 is independently associated with in-hospital mortality.
In addition, “In-hospital mortality and the Glasgow Coma Scale in the first 72 hours after traumatic brain injury” from SciELO suggests that low GCS scores are associated with in-hospital mortality.
Furthermore, “Heart rate at admission is a predictor of in-hospital mortality in patients with acute coronary syndromes” from PubMed propose that high heart rate can be associated with in-hospital mortality in certain sub-population of patients.
Lastly, “Mean nocturnal respiratory rate predicts cardiovascular and all-cause mortality in community-dwelling older men and women” from ERS reports that the association between low respiratory rate and in hospital and short term mortality.
# Subsetting the ICU dataset into variables of interest
icu_sub <- icu_patients_df1 %>% dplyr::select(in_hospital_death, ICUType, Age, Length_of_stay, SOFA, FiO2_max, RespRate_min, HR_max, GCS_min)
# Examining the structure of the subset of dataset
str(icu_sub)
## 'data.frame': 2061 obs. of 9 variables:
## $ in_hospital_death: num 0 0 0 0 0 0 0 1 0 0 ...
## $ ICUType : Factor w/ 4 levels "Coronary Care Unit",..: 4 2 3 3 3 1 3 3 3 2 ...
## $ Age : num 54 76 44 68 88 64 68 78 64 74 ...
## $ Length_of_stay : num 5 8 19 9 4 6 9 6 17 8 ...
## $ SOFA : num 1 8 11 1 2 11 4 8 0 6 ...
## $ FiO2_max : num 0.5 1 1 1 0.4 0.5 0.4 1 0.35 1 ...
## $ RespRate_min : num 12 11 18 12 15 20 11 10 19 8 ...
## $ HR_max : num 80 88 113 88 94 91 88 111 137 98 ...
## $ GCS_min : num 15 3 5 14 15 7 15 8 15 10 ...
The subset ICU dataset contains 2061 observations of 9 variables. The outcome variable is death in hospital and the predictor variables are type of ICU, age, length of stay, SOFA scores, maximum FiO2, minimum respiration rate, maximum heart rate and minimum GCS.
# Check the summary of the dataset
summary(icu_sub)
## in_hospital_death ICUType Age
## Min. :0.0000 Coronary Care Unit :297 Min. :16.00
## 1st Qu.:0.0000 Cardiac Surgery Recovery Unit:448 1st Qu.:52.00
## Median :0.0000 Medical ICU :788 Median :67.00
## Mean :0.1441 Surgical ICU :528 Mean :64.41
## 3rd Qu.:0.0000 3rd Qu.:78.00
## Max. :1.0000 Max. :90.00
## Length_of_stay SOFA FiO2_max RespRate_min
## Min. : -1.00 Min. :-1.000 Min. :0.2800 Min. : 4.00
## 1st Qu.: 6.00 1st Qu.: 3.000 1st Qu.:0.5000 1st Qu.:12.00
## Median : 10.00 Median : 6.000 Median :1.0000 Median :14.00
## Mean : 13.74 Mean : 6.441 Mean :0.7874 Mean :14.25
## 3rd Qu.: 17.00 3rd Qu.: 9.000 3rd Qu.:1.0000 3rd Qu.:17.00
## Max. :154.00 Max. :22.000 Max. :1.0000 Max. :24.00
## HR_max GCS_min
## Min. : 44.0 Min. : 3.000
## 1st Qu.: 91.0 1st Qu.: 3.000
## Median :104.0 Median : 8.000
## Mean :106.6 Mean : 8.773
## 3rd Qu.:119.0 3rd Qu.:14.000
## Max. :300.0 Max. :15.000
# Check whether there is any missing data with the subset of variables
as.data.frame(colSums(! is.na(icu_sub)))
## colSums(!is.na(icu_sub))
## in_hospital_death 2061
## ICUType 2061
## Age 2061
## Length_of_stay 2061
## SOFA 2061
## FiO2_max 2061
## RespRate_min 2061
## HR_max 2061
## GCS_min 2061
# Print the names of variables with incomplete data
names(which(colSums(! is.na(icu_sub)) < 2061))
## character(0)
# Turn in_hospital_death variable to factor variables
icu_sub$in_hospital_death <- as.factor(icu_sub$in_hospital_death)
The subset of ICU dataset has no missing data.
# for loop to produce all the plots to compare the distribution
names <- colnames(icu_sub[-c(1:2)])
plots <- for (column in names) {
print(ggplot(data = icu_sub, aes(x = icu_sub[,column], fill = in_hospital_death))+ geom_histogram() + theme_minimal() + scale_x_continuous(expand = c(0,0)) + scale_y_continuous(expand = c(0,0)) + ggtitle(paste("Histogram of", column)) + labs(x = column, fill = "Deaths in Hospital") + facet_grid(in_hospital_death~., scales = "fixed") + scale_fill_discrete(name = "Deaths in Hospital", labels = c("Survivor", "Died in Hospital")))
}
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Looking at the histograms, it is evident that majority of the patients did not die in the hospital and the distribution for all predictor variables except type of ICU for survivor and died in hospital are very similar. An interesting observation is that for the variable minimum respiration rate, the mean respiration rate is higher in those that died in hospital than those who survived. Another observation which aligns with the research is the predictor variable age, the mean age of patients is higher in the cohort that died in the hospital than those who survived.
# Fitting univariate logistic regression models
age_glm <- glm(in_hospital_death ~ Age, family = binomial, data = icu_sub)
summary(age_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Age, family = binomial, data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7522 -0.6264 -0.5111 -0.3919 2.5135
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.761624 0.303337 -12.401 < 2e-16 ***
## Age 0.029376 0.004229 6.947 3.73e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1644.9 on 2059 degrees of freedom
## AIC: 1648.9
##
## Number of Fisher Scoring iterations: 5
Icu_type_glm <- glm(in_hospital_death ~ ICUType, family = binomial, data = icu_sub)
summary(Icu_type_glm)
##
## Call:
## glm(formula = in_hospital_death ~ ICUType, family = binomial,
## data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6402 -0.6402 -0.5615 -0.3458 2.3861
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.6463 0.1576 -10.443 < 2e-16 ***
## ICUTypeCardiac Surgery Recovery Unit -1.1407 0.2563 -4.451 8.55e-06 ***
## ICUTypeMedical ICU 0.1653 0.1824 0.906 0.365
## ICUTypeSurgical ICU -0.1214 0.2001 -0.607 0.544
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1655.3 on 2057 degrees of freedom
## AIC: 1663.3
##
## Number of Fisher Scoring iterations: 5
stay_glm <- glm(in_hospital_death ~ Length_of_stay, family = binomial, data = icu_sub)
summary(stay_glm)
##
## Call:
## glm(formula = in_hospital_death ~ Length_of_stay, family = binomial,
## data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.8655 -0.5594 -0.5466 -0.5413 2.0058
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.882249 0.089155 -21.112 <2e-16 ***
## Length_of_stay 0.007099 0.004331 1.639 0.101
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1697.2 on 2059 degrees of freedom
## AIC: 1701.2
##
## Number of Fisher Scoring iterations: 4
SOFA_glm <- glm(in_hospital_death ~ SOFA, family = binomial, data = icu_sub)
summary(SOFA_glm)
##
## Call:
## glm(formula = in_hospital_death ~ SOFA, family = binomial, data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.0747 -0.5835 -0.4771 -0.3623 2.4609
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.83453 0.14090 -20.117 <2e-16 ***
## SOFA 0.14378 0.01539 9.342 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1607.5 on 2059 degrees of freedom
## AIC: 1611.5
##
## Number of Fisher Scoring iterations: 5
FiO2_glm <- glm(in_hospital_death ~ FiO2_max, family = binomial, data = icu_sub)
summary(FiO2_glm)
##
## Call:
## glm(formula = in_hospital_death ~ FiO2_max, family = binomial,
## data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.5634 -0.5634 -0.5555 -0.5504 1.9898
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.8614 0.2102 -8.856 <2e-16 ***
## FiO2_max 0.1011 0.2533 0.399 0.69
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1699.5 on 2059 degrees of freedom
## AIC: 1703.5
##
## Number of Fisher Scoring iterations: 4
RespRate_glm <- glm(in_hospital_death ~ RespRate_min, family = binomial, data = icu_sub)
summary(RespRate_glm)
##
## Call:
## glm(formula = in_hospital_death ~ RespRate_min, family = binomial,
## data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.7929 -0.5872 -0.5222 -0.4636 2.3445
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.01958 0.25802 -11.703 < 2e-16 ***
## RespRate_min 0.08432 0.01656 5.091 3.57e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1673.6 on 2059 degrees of freedom
## AIC: 1677.6
##
## Number of Fisher Scoring iterations: 4
HR_glm <- glm(in_hospital_death ~ HR_max, family = binomial, data = icu_sub)
summary(HR_glm)
##
## Call:
## glm(formula = in_hospital_death ~ HR_max, family = binomial,
## data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1194 -0.5733 -0.5402 -0.5067 2.1517
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.707555 0.303251 -8.928 < 2e-16 ***
## HR_max 0.008565 0.002707 3.164 0.00156 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1689.9 on 2059 degrees of freedom
## AIC: 1693.9
##
## Number of Fisher Scoring iterations: 4
GCS_glm <- glm(in_hospital_death ~ GCS_min, family = binomial, data = icu_sub)
summary(GCS_glm)
##
## Call:
## glm(formula = in_hospital_death ~ GCS_min, family = binomial,
## data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6238 -0.6238 -0.5394 -0.4853 2.0964
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.40261 0.12298 -11.405 < 2e-16 ***
## GCS_min -0.04514 0.01317 -3.426 0.000612 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1687.7 on 2059 degrees of freedom
## AIC: 1691.7
##
## Number of Fisher Scoring iterations: 4
# Null model
null <- glm(in_hospital_death ~ 1, family = binomial, data = icu_sub)
# for loop to compare null model against each individual univariate logistic regression models
glm_mod <- list(age_glm, Icu_type_glm, stay_glm, SOFA_glm, FiO2_glm, RespRate_glm, HR_glm, GCS_glm)
# Comparing different models
for (model in glm_mod) {
print(anova(null, model, test = "Chi"))
}
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ 1
## Model 2: in_hospital_death ~ Age
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2060 1699.7
## 2 2059 1644.9 1 54.756 1.364e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ 1
## Model 2: in_hospital_death ~ ICUType
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2060 1699.7
## 2 2057 1655.3 3 44.388 1.248e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ 1
## Model 2: in_hospital_death ~ Length_of_stay
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2060 1699.7
## 2 2059 1697.2 1 2.51 0.1131
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ 1
## Model 2: in_hospital_death ~ SOFA
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2060 1699.7
## 2 2059 1607.5 1 92.151 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ 1
## Model 2: in_hospital_death ~ FiO2_max
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2060 1699.7
## 2 2059 1699.5 1 0.15956 0.6896
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ 1
## Model 2: in_hospital_death ~ RespRate_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2060 1699.7
## 2 2059 1673.6 1 26.13 3.193e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ 1
## Model 2: in_hospital_death ~ HR_max
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2060 1699.7
## 2 2059 1689.9 1 9.7436 0.001799 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ 1
## Model 2: in_hospital_death ~ GCS_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2060 1699.7
## 2 2059 1687.7 1 11.957 0.0005445 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the analysis of deviance, it seems that the age, type of ICU, SOFA scores, minimum respiration rate, maximum heart rate and low GCS are significant predictors compared to the null model as their p-values are below 0.05.
# Using AIC for model selection
full_mod <- glm(in_hospital_death~., family = binomial, data = icu_sub)
aic_suggest_mod <- step(full_mod, trace = 1)
## Start: AIC=1486.25
## in_hospital_death ~ ICUType + Age + Length_of_stay + SOFA + FiO2_max +
## RespRate_min + HR_max + GCS_min
##
## Df Deviance AIC
## - FiO2_max 1 1464.3 1484.3
## - GCS_min 1 1464.4 1484.4
## - Length_of_stay 1 1464.4 1484.4
## <none> 1464.2 1486.2
## - HR_max 1 1466.3 1486.3
## - RespRate_min 1 1467.1 1487.1
## - ICUType 3 1526.5 1542.5
## - Age 1 1525.1 1545.1
## - SOFA 1 1533.0 1553.0
##
## Step: AIC=1484.27
## in_hospital_death ~ ICUType + Age + Length_of_stay + SOFA + RespRate_min +
## HR_max + GCS_min
##
## Df Deviance AIC
## - GCS_min 1 1464.4 1482.4
## - Length_of_stay 1 1464.5 1482.5
## <none> 1464.3 1484.3
## - HR_max 1 1466.3 1484.3
## - RespRate_min 1 1467.2 1485.2
## - Age 1 1525.1 1543.1
## - ICUType 3 1529.3 1543.3
## - SOFA 1 1534.0 1552.0
##
## Step: AIC=1482.4
## in_hospital_death ~ ICUType + Age + Length_of_stay + SOFA + RespRate_min +
## HR_max
##
## Df Deviance AIC
## - Length_of_stay 1 1464.6 1480.6
## - HR_max 1 1466.4 1482.4
## <none> 1464.4 1482.4
## - RespRate_min 1 1467.2 1483.2
## - Age 1 1526.6 1542.6
## - ICUType 3 1539.6 1551.6
## - SOFA 1 1568.3 1584.3
##
## Step: AIC=1480.59
## in_hospital_death ~ ICUType + Age + SOFA + RespRate_min + HR_max
##
## Df Deviance AIC
## - HR_max 1 1466.5 1480.5
## <none> 1464.6 1480.6
## - RespRate_min 1 1467.4 1481.4
## - Age 1 1527.0 1541.0
## - ICUType 3 1539.6 1549.6
## - SOFA 1 1569.7 1583.7
##
## Step: AIC=1480.48
## in_hospital_death ~ ICUType + Age + SOFA + RespRate_min
##
## Df Deviance AIC
## <none> 1466.5 1480.5
## - RespRate_min 1 1470.0 1482.0
## - Age 1 1527.1 1539.1
## - ICUType 3 1545.5 1553.5
## - SOFA 1 1576.7 1588.7
summary(aic_suggest_mod)
##
## Call:
## glm(formula = in_hospital_death ~ ICUType + Age + SOFA + RespRate_min,
## family = binomial, data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.4776 -0.5750 -0.3988 -0.2493 2.9832
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.556084 0.476775 -11.653 < 2e-16 ***
## ICUTypeCardiac Surgery Recovery Unit -1.554349 0.270838 -5.739 9.52e-09 ***
## ICUTypeMedical ICU 0.218533 0.196469 1.112 0.2660
## ICUTypeSurgical ICU 0.001326 0.215072 0.006 0.9951
## Age 0.032684 0.004464 7.321 2.45e-13 ***
## SOFA 0.167784 0.016551 10.137 < 2e-16 ***
## RespRate_min 0.033880 0.017999 1.882 0.0598 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1466.5 on 2054 degrees of freedom
## AIC: 1480.5
##
## Number of Fisher Scoring iterations: 5
The AIC suggested models only contain four predictor variables which are type of ICU, age, SOFA score and minimum respiration rate. Although this is the best model according to AIC indication, we still wanted to include maximum heart rate and minimum GCS and complete 3 more tests of analysis of deviance to compare the models.
# analysis of deviance
test_mod_glm1 <- glm(in_hospital_death ~ ICUType + Age + SOFA + RespRate_min + HR_max + GCS_min,
family = binomial, data = icu_sub)
print(anova(aic_suggest_mod, test_mod_glm1, test = "Chi"))
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ ICUType + Age + SOFA + RespRate_min
## Model 2: in_hospital_death ~ ICUType + Age + SOFA + RespRate_min + HR_max +
## GCS_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2054 1466.5
## 2 2052 1464.5 2 2.0205 0.3641
# analysis of deviance
test_mod_glm2 <- glm(in_hospital_death ~ ICUType + Age + SOFA + RespRate_min + GCS_min,
family = binomial, data = icu_sub)
print(anova(aic_suggest_mod, test_mod_glm2, test = "Chi"))
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ ICUType + Age + SOFA + RespRate_min
## Model 2: in_hospital_death ~ ICUType + Age + SOFA + RespRate_min + GCS_min
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2054 1466.5
## 2 2053 1466.4 1 0.12025 0.7288
# analysis of deviance
test_mod_glm3 <- glm(in_hospital_death ~ ICUType + Age + SOFA + RespRate_min + HR_max,
family = binomial, data = icu_sub)
print(anova(aic_suggest_mod, test_mod_glm3, test = "Chi"))
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ ICUType + Age + SOFA + RespRate_min
## Model 2: in_hospital_death ~ ICUType + Age + SOFA + RespRate_min + HR_max
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2054 1466.5
## 2 2053 1464.6 1 1.8884 0.1694
The 3 analysis of deviance test suggest there are only 4 significant predictors of deaths in hospital as suggested by AIC and these are ICU type, age, SOFA scores and minimum respiration rate.
# Checking if the reduced model is better fitted with interactions according to the AIC
(reduced_interactions <-stepAIC(aic_suggest_mod, direction="both", scope=list(lower=~1, upper=~.^2), trace=F, data=icu_sub))
##
## Call: glm(formula = in_hospital_death ~ ICUType + Age + SOFA + RespRate_min +
## ICUType:SOFA + Age:SOFA, family = binomial, data = icu_sub)
##
## Coefficients:
## (Intercept)
## -7.167975
## ICUTypeCardiac Surgery Recovery Unit
## 0.520177
## ICUTypeMedical ICU
## 0.686091
## ICUTypeSurgical ICU
## 0.750526
## Age
## 0.048358
## SOFA
## 0.374769
## RespRate_min
## 0.032958
## ICUTypeCardiac Surgery Recovery Unit:SOFA
## -0.234805
## ICUTypeMedical ICU:SOFA
## -0.067856
## ICUTypeSurgical ICU:SOFA
## -0.101957
## Age:SOFA
## -0.001929
##
## Degrees of Freedom: 2060 Total (i.e. Null); 2050 Residual
## Null Deviance: 1700
## Residual Deviance: 1453 AIC: 1475
summary(reduced_interactions)
##
## Call:
## glm(formula = in_hospital_death ~ ICUType + Age + SOFA + RespRate_min +
## ICUType:SOFA + Age:SOFA, family = binomial, data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.4704 -0.5761 -0.3853 -0.2504 3.0985
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.167975 0.840892 -8.524 < 2e-16
## ICUTypeCardiac Surgery Recovery Unit 0.520177 0.692867 0.751 0.45280
## ICUTypeMedical ICU 0.686091 0.417913 1.642 0.10065
## ICUTypeSurgical ICU 0.750526 0.456305 1.645 0.10001
## Age 0.048358 0.009578 5.049 4.45e-07
## SOFA 0.374769 0.089250 4.199 2.68e-05
## RespRate_min 0.032958 0.018099 1.821 0.06861
## ICUTypeCardiac Surgery Recovery Unit:SOFA -0.234805 0.076248 -3.080 0.00207
## ICUTypeMedical ICU:SOFA -0.067856 0.049094 -1.382 0.16693
## ICUTypeSurgical ICU:SOFA -0.101957 0.053530 -1.905 0.05682
## Age:SOFA -0.001929 0.001039 -1.857 0.06336
##
## (Intercept) ***
## ICUTypeCardiac Surgery Recovery Unit
## ICUTypeMedical ICU
## ICUTypeSurgical ICU
## Age ***
## SOFA ***
## RespRate_min .
## ICUTypeCardiac Surgery Recovery Unit:SOFA **
## ICUTypeMedical ICU:SOFA
## ICUTypeSurgical ICU:SOFA .
## Age:SOFA .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1453.3 on 2050 degrees of freedom
## AIC: 1475.3
##
## Number of Fisher Scoring iterations: 6
After checking the summary statistics for the logistic regression model including the 2 interactions, the age:SOFA interaction have a p-value above 0.05, which means that it is not statistically significant. Therefore, it is reasonable to drop this interaction.
# Refitting the model with one interaction term
reduced_interactions2 <- glm(in_hospital_death ~ ICUType + Age + SOFA + RespRate_min +
ICUType:SOFA, family = binomial, data = icu_sub)
summary(reduced_interactions2)
##
## Call:
## glm(formula = in_hospital_death ~ ICUType + Age + SOFA + RespRate_min +
## ICUType:SOFA, family = binomial, data = icu_sub)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.5185 -0.5595 -0.3868 -0.2659 2.9280
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.032598 0.549452 -10.979 < 2e-16
## ICUTypeCardiac Surgery Recovery Unit 0.470833 0.690432 0.682 0.49528
## ICUTypeMedical ICU 0.570995 0.412816 1.383 0.16661
## ICUTypeSurgical ICU 0.648873 0.451775 1.436 0.15092
## Age 0.033045 0.004488 7.364 1.79e-13
## SOFA 0.230223 0.042749 5.385 7.22e-08
## RespRate_min 0.034001 0.018147 1.874 0.06098
## ICUTypeCardiac Surgery Recovery Unit:SOFA -0.229240 0.076080 -3.013 0.00259
## ICUTypeMedical ICU:SOFA -0.050096 0.048208 -1.039 0.29873
## ICUTypeSurgical ICU:SOFA -0.087938 0.053034 -1.658 0.09729
##
## (Intercept) ***
## ICUTypeCardiac Surgery Recovery Unit
## ICUTypeMedical ICU
## ICUTypeSurgical ICU
## Age ***
## SOFA ***
## RespRate_min .
## ICUTypeCardiac Surgery Recovery Unit:SOFA **
## ICUTypeMedical ICU:SOFA
## ICUTypeSurgical ICU:SOFA .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1699.7 on 2060 degrees of freedom
## Residual deviance: 1456.8 on 2051 degrees of freedom
## AIC: 1476.8
##
## Number of Fisher Scoring iterations: 5
# Conducting an analysis of deviance
print(anova(aic_suggest_mod, reduced_interactions2, test = "Chi"))
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ ICUType + Age + SOFA + RespRate_min
## Model 2: in_hospital_death ~ ICUType + Age + SOFA + RespRate_min + ICUType:SOFA
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2054 1466.5
## 2 2051 1456.8 3 9.673 0.02156 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
It seems that the model including the interaction term is preferred over the model without the interaction as shown by the p-value of 0.02156.
# Checking the model fit using binned residual plot
binnedplot(predict(reduced_interactions2), residuals(aic_suggest_mod))
The variance of the residuals in the binned residual plot seems to be constant and evenly distributed.
options(scipen=999)
# Checking the fit of the reduced model using Pseudo-R2
PseudoR2(aic_suggest_mod, which = "all")
## McFadden McFaddenAdj CoxSnell Nagelkerke AldrichNelson
## 0.1372089 0.1289721 0.1069877 0.1904951 0.1016525
## VeallZimmermann Efron McKelveyZavoina Tjur AIC
## 0.2249137 0.1320017 0.2560243 0.1286316 1480.4757799
## BIC logLik logLik0 G2
## 1519.8924060 -733.2378900 -849.8440447 233.2123095
options(scipen=999)
# Checking the fit of the full model using Pseudo-R2
PseudoR2(reduced_interactions2, which = "all")
## McFadden McFaddenAdj CoxSnell Nagelkerke AldrichNelson
## 0.1428999 0.1311330 0.1111691 0.1979402 0.1054242
## VeallZimmermann Efron McKelveyZavoina Tjur AIC
## 0.2332590 0.1352942 0.2413472 0.1350906 1476.8028164
## BIC logLik logLik0 G2
## 1533.1122822 -728.4014082 -849.8440447 242.8852730
The Pseudo-R2 is slightly better in the model with the interaction compared to the one without as the Pseudo R2 values are slightly larger meaning that the model with interaction has a slight better fit.
# Changing the variable into numeric so doesn't run into error when knitting
icu_sub$in_hospital_death <- as.numeric(icu_sub$in_hospital_death)
# Checking the goodness of fit of reduced model using Brier Score test
pred_prob_reduced <- predict(aic_suggest_mod, type = "response")
brier_score_reduced <- mean((pred_prob_reduced - icu_sub$in_hospital_death)^2)
brier_score_reduced
## [1] 1.107058
# Checking the goodness of fit of reduced model with interaction using Brier Score test
pred_prob_reduced_interactions2 <- predict(reduced_interactions2, type = "response")
brier_score_reduced <- mean((pred_prob_reduced_interactions2 - icu_sub$in_hospital_death)^2)
brier_score_reduced
## [1] 1.106652
The difference in Brier scores between the two is very small, therefore the reduced model with interaction is preferred based on the analysis of deviance conducted earlier.
unimputed <- glm(in_hospital_death ~ ICUType + Age + SOFA + RespRate_min +
ICUType:SOFA, family = binomial, data = icu_patients_df0)
summary(unimputed)
##
## Call:
## glm(formula = in_hospital_death ~ ICUType + Age + SOFA + RespRate_min +
## ICUType:SOFA, family = binomial, data = icu_patients_df0)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.2449 -0.4108 -0.2805 -0.1933 3.2034
##
## Coefficients:
## Estimate Std. Error z value
## (Intercept) -7.23566 1.28073 -5.650
## ICUTypeCardiac Surgery Recovery Unit 2.07686 1.39224 1.492
## ICUTypeMedical ICU 0.59752 0.87606 0.682
## ICUTypeSurgical ICU 1.18057 1.10814 1.065
## Age 0.03403 0.01170 2.908
## SOFA 0.32690 0.13674 2.391
## RespRate_min 0.05588 0.04792 1.166
## ICUTypeCardiac Surgery Recovery Unit:SOFA -0.29536 0.26295 -1.123
## ICUTypeMedical ICU:SOFA -0.06000 0.15122 -0.397
## ICUTypeSurgical ICU:SOFA -0.37578 0.28639 -1.312
## Pr(>|z|)
## (Intercept) 0.0000000161 ***
## ICUTypeCardiac Surgery Recovery Unit 0.13577
## ICUTypeMedical ICU 0.49521
## ICUTypeSurgical ICU 0.28671
## Age 0.00363 **
## SOFA 0.01682 *
## RespRate_min 0.24350
## ICUTypeCardiac Surgery Recovery Unit:SOFA 0.26134
## ICUTypeMedical ICU:SOFA 0.69151
## ICUTypeSurgical ICU:SOFA 0.18947
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 317.19 on 554 degrees of freedom
## Residual deviance: 266.33 on 545 degrees of freedom
## (1506 observations deleted due to missingness)
## AIC: 286.33
##
## Number of Fisher Scoring iterations: 6
# calculate p-value
(p_unimputed <- 1 - pchisq(unimputed$null.deviance - unimputed$deviance, unimputed$df.null - unimputed$df.residual))
## [1] 0.00000007440371
# Changing the variable into factor so doesn't run into error when knitting
icu_sub$in_hospital_death <- as.factor(icu_sub$in_hospital_death)
reduced_null <-glm(in_hospital_death ~ 1, family = binomial, data = icu_sub)
# calculate p-value
print(anova(reduced_null, reduced_interactions2, test = "Chi"))
## Analysis of Deviance Table
##
## Model 1: in_hospital_death ~ 1
## Model 2: in_hospital_death ~ ICUType + Age + SOFA + RespRate_min + ICUType:SOFA
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 2060 1699.7
## 2 2051 1456.8 9 242.88 < 0.00000000000000022 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Looking through the summary statistics and p-values of the imputed and the unimputed datasets, the beta estimates of the predictor variables altered slightly, and the ICU (Cardiac Surgery Recovery Unit) and SOFA score interaction became insignificant in the unimputed dataset. The AIC for the unimputed dataset is 286.33 which is much better than the AIC for the imputed dataset of 1475.3. This means that the model is a better fit for the unimputed dataset.
Summary Findings
After comparing different models, the final model we have fitted contains the predictor variables (type of ICU, age, SOFA scores and minimum respiration rate) and an interaction between type of ICU and SOFA scores. The p-value is 0 if rounded to 3 decimal places which suggests that the model is better than the null model.
Through the analysis of deviance, it is apparent that the model with an interaction is the preferred model over the one without as shown by the p-value of 0.02156. This is also confirmed by the Pseudo R2 values. Although the Brier Scores show the model without the interaction has a slightly better fit, but the differences is negligible if rounded to 3 decimal places where both values would be 1.107. The binned residual plot also did not raise any concerns as the variance of the residuals in the binned residual plot seems to be constant and evenly distributed. In conclusion, the findings of this study suggests that the in-hospital mortality rate is associated with type of ICU, age of the patient, SOFA scores and minimum respiration rate out of the initial subset of the predictor variables that were selected.
In this task, you are required to develop a Cox proportional hazards survival model using the icu_patients_df1 data set which adequately explains or predicts the length of survival indicated by the Days variable, with censoring as indicated by the Status variable. You should fit a series of models, maybe three or four, evaluating each one, before you present your final model. Your final model should not include all the predictor variables, just a small subset of them, which you have selected based on statistical significance and/or background knowledge. Aim for between five and ten predictor variables (slightly more or fewer is OK). It is perfectly acceptable to include predictor variables in your final model which are not statistically significant, as long as you justify their inclusion on medical or physiological grounds (you will not be marked down if your medical justification is not exactly correct, but do you best). You should assess each model you consider for goodness of fit and other relevant statistics, and you should assess your final model for violations of assumptions and perform other diagnostics which you think are relevant (and modify the model if indicated, or at least comment on the possible impact of what your diagnostics show). Finally, re-fit your final model to the unimputed data frame (icu_patients_df0.rds) and comment on any differences you find.
head(icu_patients_df1)
## RecordID Length_of_stay SAPS1 SOFA Survival in_hospital_death Days Status Age
## 1 132539 5 6 1 NA 0 2408 FALSE 54
## 2 132540 8 16 8 NA 0 2408 FALSE 76
## 3 132541 19 21 11 NA 0 2408 FALSE 44
## 4 132543 9 7 1 575 0 575 TRUE 68
## 5 132545 4 17 2 918 0 918 TRUE 88
## 6 132547 6 14 11 1637 0 1637 TRUE 64
## Albumin_diff Albumin_max Albumin_min ALP_diff ALP_max ALP_min ALT_diff
## 1 0.2186633 3.2 3.1 118.147964 214 202 80.44617
## 2 0.8813367 2.1 2.2 252.147964 338 348 94.44617
## 3 0.6813367 2.7 2.3 31.147964 127 105 45.44617
## 4 1.4186633 4.4 4.4 9.147964 105 105 108.44617
## 5 0.3813367 2.7 2.6 56.852036 39 78 96.44617
## 6 0.4186633 3.4 3.3 5.147964 101 101 75.44617
## ALT_max ALT_min AST_diff AST_max AST_min Bilirubin_diff Bilirubin_max
## 1 40 75 131.35271 38 53 1.464039 0.4
## 2 206 26 116.35271 53 74 1.564039 1.2
## 3 91 75 65.64729 235 164 1.235961 3.0
## 4 12 12 154.35271 15 15 1.564039 0.2
## 5 24 32 154.35271 15 97 1.364039 0.4
## 6 60 45 122.35271 162 47 1.364039 0.4
## Bilirubin_min BUN_diff BUN_max BUN_min Cholesterol_diff Cholesterol_max
## 1 0.3 11.527053 13 13 16.42276 154
## 2 0.2 8.527053 18 16 28.42276 139
## 3 2.8 21.527053 8 3 56.42276 111
## 4 0.2 4.527053 23 20 37.42276 127
## 5 0.9 20.472947 45 45 55.42276 104
## 6 0.4 9.527053 19 15 55.57724 212
## Cholesterol_min Creatinine_diff Creatinine_max Creatinine_min DiasABP_diff
## 1 140 0.4324463 0.8 0.8 NA
## 2 128 0.4324463 1.2 0.8 26.54421
## 3 100 0.9324463 0.4 0.3 NA
## 4 119 0.5324463 0.9 0.7 NA
## 5 101 0.2324463 1.0 1.0 NA
## 6 212 0.3324463 1.4 0.9 20.45579
## DiasABP_max DiasABP_min FiO2_diff FiO2_max FiO2_min GCS_diff GCS_max GCS_min
## 1 NA NA 0.05192012 0.5 0.5 3.755971 15 15
## 2 81 32 0.44807988 1.0 0.4 8.244029 15 3
## 3 NA NA 0.44807988 1.0 0.5 6.244029 8 5
## 4 NA NA 0.44807988 1.0 0.4 3.755971 15 14
## 5 NA NA 0.15192012 0.4 0.5 3.755971 15 15
## 6 79 55 0.05192012 0.5 0.5 4.244029 9 7
## Gender Glucose_diff Glucose_max Glucose_min HCO3_diff HCO3_max HCO3_min
## 1 Female 65.14446 205 205 3.227452 26 26
## 2 Male 34.85554 105 105 1.772548 22 21
## 3 Female 20.85554 141 119 3.227452 26 24
## 4 Male 33.85554 129 106 5.227452 28 27
## 5 Female 26.85554 113 113 4.772548 18 18
## 6 Male 124.14446 264 197 3.772548 19 19
## HCT_diff HCT_max HCT_min Height HR_diff HR_max HR_min
## 1 2.739871 33.7 33.5 NA 29.077891 80 58
## 2 6.260129 29.7 24.7 175.3 7.077891 88 80
## 3 4.260129 28.5 26.7 NA 30.077891 113 57
## 4 10.339871 41.3 36.1 180.3 30.077891 88 57
## 5 8.360129 30.8 22.6 NA 20.077891 94 67
## 6 10.639871 41.6 36.8 180.3 16.077891 91 71
## ICUType K_diff K_max K_min Lactate_diff Lactate_max
## 1 Surgical ICU 0.2647934 4.4 4.4 0.9964037 1.9
## 2 Cardiac Surgery Recovery Unit 0.1647934 4.3 4.3 1.4964037 2.9
## 3 Medical ICU 4.4647934 8.6 3.3 1.4964037 1.9
## 4 Medical ICU 0.1352066 4.2 4.0 1.5964037 1.2
## 5 Medical ICU 1.8647934 6.0 3.8 0.8964037 2.0
## 6 Coronary Care Unit 0.9647934 5.1 3.8 1.8964037 0.9
## Lactate_min MAP_diff MAP_max MAP_min Mg_diff Mg_max Mg_min Na_diff Na_max
## 1 1.8 31.23164 109 56 0.4842982 1.5 1.5 2.2066071 137
## 2 1.3 34.76836 100 43 1.1157018 3.1 1.9 0.2066071 139
## 3 1.3 53.23164 131 71 0.6842982 1.9 1.3 2.2066071 140
## 4 1.5 24.23164 102 72 0.1157018 2.1 2.1 1.7933929 141
## 5 1.9 9.76836 78 68 0.4842982 1.5 1.5 0.7933929 140
## 6 1.3 24.23164 102 62 0.2842982 1.7 1.7 2.2066071 141
## Na_min NIDiasABP_diff NIDiasABP_max NIDiasABP_min NIMAP_diff NIMAP_max
## 1 137 17.49101 65 40 17.04069 92.33
## 2 139 19.49101 65 38 26.38069 86.33
## 3 137 37.50899 95 66 34.28931 110.00
## 4 140 23.50899 81 54 24.98931 100.70
## 5 140 38.50899 96 29 29.98931 105.70
## 6 137 31.50899 89 52 26.58931 102.30
## NIMAP_min NISysABP_diff NISysABP_max NISysABP_min PaCO2_diff PaCO2_max
## 1 58.67 40.30125 157 96 3.335797 37
## 2 49.33 44.69875 129 72 7.335797 41
## 3 83.33 33.30125 150 111 3.335797 37
## 4 73.00 23.30125 140 102 9.335797 38
## 5 63.67 39.30125 156 119 6.335797 34
## 6 61.67 35.69875 129 81 5.335797 45
## PaCO2_min PaO2_diff PaO2_max PaO2_min pH_diff pH_max pH_min Platelets_diff
## 1 38 47.61789 186 111 0.12011376 7.49 7.43 31.23069
## 2 33 286.38211 445 89 0.08011376 7.45 7.34 36.23069
## 3 37 93.61789 65 65 0.14011376 7.51 7.51 117.76931
## 4 31 94.61789 148 64 0.14011376 7.51 7.47 201.23069
## 5 35 80.61789 78 84 0.04011376 7.38 7.41 80.76931
## 6 35 80.61789 101 78 0.07988624 7.40 7.29 86.23069
## Platelets_max Platelets_min RespRate_diff RespRate_max RespRate_min SaO2_diff
## 1 221 221 7.34858 24 12 3.246079
## 2 226 164 16.65142 36 11 1.753921
## 3 84 72 13.65142 33 18 2.246079
## 4 391 315 7.34858 21 12 1.753921
## 5 109 109 6.65142 26 15 3.246079
## 6 276 219 27.65142 47 20 1.246079
## SaO2_max SaO2_min SysABP_diff SysABP_max SysABP_min Temp_diff Temp_max
## 1 98 94 NA NA NA 1.874083 38.1
## 2 99 97 50.3105 135 66 2.474083 37.9
## 3 95 95 NA NA NA 2.025917 39.0
## 4 99 97 NA NA NA 1.874083 36.7
## 5 97 94 NA NA NA 1.174083 37.8
## 6 97 96 43.3105 152 73 1.174083 37.8
## Temp_min TroponinI_diff TroponinI_max TroponinI_min TroponinT_diff
## 1 35.1 5.1429448 1.0 0.3 0.4785006
## 2 34.5 26.2570552 31.7 16.1 0.6485006
## 3 36.7 31.2570552 33.4 36.7 0.8814994
## 4 35.1 0.8570552 5.9 6.3 0.6485006
## 5 35.8 0.1570552 5.6 5.6 0.6085006
## 6 35.8 4.1429448 1.3 1.3 0.6385006
## TroponinT_max TroponinT_min Urine_diff Urine_max Urine_min WBC_diff WBC_max
## 1 0.58 0.19 800.78242 900 30 0.9331524 11.2
## 2 0.43 0.02 670.78242 770 0 4.7331524 13.1
## 3 1.55 1.41 310.78242 410 30 8.4331524 4.2
## 4 0.10 0.02 600.78242 700 100 3.3331524 11.5
## 5 0.06 0.37 83.21758 150 16 8.3331524 3.8
## 6 0.03 0.10 1100.78242 1200 40 11.8668476 24.0
## WBC_min Weight_diff Weight_max Weight_min
## 1 11.2 NA NA NA
## 2 7.4 4.699878 80.6 76.0
## 3 3.7 23.999878 56.7 56.7
## 4 8.8 3.900122 84.6 84.6
## 5 3.8 NA NA NA
## 6 14.4 33.300122 114.0 114.0
The explanatory variables I will use to predict survival will include:
the Age demographic
important vital signs: HR_min, RespRate_min, SaO2_max, NiSysABP_max, Temp_max, GCS_max, Urine_max
important biochemical markers chosen based on clinical experience: Lactate_max, HCT_max, Creatinine_max, Creatinine_max
I will create a subset of the icu_patients_df1 with the above predictor variables, and the outcome variables Days (survival) and Status (censoring).
icu_sub2 <- icu_patients_df1 %>% dplyr::select(Days, Status, Age, HR_max, HR_min, HR_diff, RespRate_max, RespRate_min, RespRate_diff, SaO2_max, SaO2_min, SaO2_diff, SysABP_max, SysABP_min, SysABP_diff, Temp_max, Temp_min, Temp_diff, GCS_max, GCS_min, GCS_diff, Urine_max, Urine_min, Urine_diff, Lactate_max, Lactate_min, Lactate_diff, HCT_max, HCT_min, HCT_diff, Creatinine_max, Creatinine_min, Creatinine_diff, Na_max, Na_min, Na_diff, K_max, K_min, K_diff, TroponinI_max, TroponinI_min, TroponinI_diff)
# nrow(icu_sub2)
# as.data.frame(colSums(is.na(icu_sub2)))
# icu_sub2c <- icu_sub2[!is.na(icu_sub2$SysABP_max),]
icu_sub2c <- na.omit(icu_sub2)
rownames(icu_sub2c) <- NULL
A brief exploratory data analysis is performed on the variables of interest.
predictors <- colnames(icu_sub2[-c(1,2)])
for (predictor in predictors) {
p <- ggplot(icu_sub2, aes(x = icu_sub2[,predictor], fill=Status)) +
geom_density(alpha = 0.3) +
theme_minimal() +
scale_x_continuous(expand = c(0,0)) +
scale_y_continuous(expand = c(0,0)) +
ggtitle(paste("Density Plot of", predictor)) +
labs(x = predictor)
plot(p)
}
## Warning: Removed 715 rows containing non-finite values (stat_density).
## Warning: Removed 715 rows containing non-finite values (stat_density).
## Warning: Removed 715 rows containing non-finite values (stat_density).
summary(icu_sub2)
## Days Status Age HR_max
## Min. : 0 Mode :logical Min. :16.00 Min. : 44.0
## 1st Qu.: 265 FALSE:1288 1st Qu.:52.00 1st Qu.: 91.0
## Median :2408 TRUE :773 Median :67.00 Median :104.0
## Mean :1634 Mean :64.41 Mean :106.6
## 3rd Qu.:2408 3rd Qu.:78.00 3rd Qu.:119.0
## Max. :2408 Max. :90.00 Max. :300.0
##
## HR_min HR_diff RespRate_max RespRate_min
## Min. : 0.00 Min. : 0.9221 Min. :13.00 Min. : 4.00
## 1st Qu.: 61.00 1st Qu.: 20.0779 1st Qu.:24.00 1st Qu.:12.00
## Median : 71.00 Median : 27.0779 Median :27.00 Median :14.00
## Mean : 71.99 Mean : 30.4294 Mean :29.12 Mean :14.25
## 3rd Qu.: 81.00 3rd Qu.: 36.9221 3rd Qu.:33.00 3rd Qu.:17.00
## Max. :126.00 Max. :212.9221 Max. :98.00 Max. :24.00
##
## RespRate_diff SaO2_max SaO2_min SaO2_diff
## Min. : 0.6514 Min. : 75.00 Min. : 33.00 Min. : 0.2461
## 1st Qu.: 7.3486 1st Qu.: 97.00 1st Qu.: 95.00 1st Qu.: 0.7539
## Median : 9.6514 Median : 98.00 Median : 97.00 Median : 1.7539
## Mean :11.6075 Mean : 97.44 Mean : 95.85 Mean : 2.5635
## 3rd Qu.:13.6514 3rd Qu.: 99.00 3rd Qu.: 98.00 3rd Qu.: 3.2461
## Max. :78.6514 Max. :100.00 Max. :100.00 Max. :64.2461
##
## SysABP_max SysABP_min SysABP_diff Temp_max
## Min. : 52.0 Min. : 11.00 Min. : 3.689 Min. :35.40
## 1st Qu.:135.0 1st Qu.: 79.00 1st Qu.: 32.310 1st Qu.:37.10
## Median :149.0 Median : 88.00 Median : 40.690 Median :37.60
## Mean :152.1 Mean : 90.91 Mean : 45.008 Mean :37.69
## 3rd Qu.:167.0 3rd Qu.:102.00 3rd Qu.: 53.690 3rd Qu.:38.20
## Max. :295.0 Max. :262.00 Max. :178.690 Max. :42.10
## NA's :715 NA's :715 NA's :715
## Temp_min Temp_diff GCS_max GCS_min
## Min. :24.20 Min. : 0.1259 Min. : 3.00 Min. : 3.000
## 1st Qu.:35.60 1st Qu.: 0.8741 1st Qu.:11.00 1st Qu.: 3.000
## Median :36.10 Median : 1.2741 Median :15.00 Median : 8.000
## Mean :36.01 Mean : 1.3756 Mean :12.87 Mean : 8.773
## 3rd Qu.:36.60 3rd Qu.: 1.7259 3rd Qu.:15.00 3rd Qu.:14.000
## Max. :38.30 Max. :12.7741 Max. :15.00 Max. :15.000
##
## GCS_diff Urine_max Urine_min Urine_diff
## Min. :0.244 Min. : 0.0 Min. : 0.00 Min. : 19.22
## 1st Qu.:3.756 1st Qu.: 200.0 1st Qu.: 0.00 1st Qu.: 100.78
## Median :3.756 Median : 400.0 Median : 20.00 Median : 300.78
## Mean :5.183 Mean : 521.8 Mean : 34.55 Mean : 438.25
## 3rd Qu.:8.244 3rd Qu.: 625.0 3rd Qu.: 36.00 3rd Qu.: 525.78
## Max. :8.244 Max. :5000.0 Max. :600.00 Max. :4900.78
##
## Lactate_max Lactate_min Lactate_diff HCT_max
## Min. : 0.400 Min. : 0.300 Min. : 0.003596 Min. :21.20
## 1st Qu.: 1.500 1st Qu.: 1.200 1st Qu.: 1.096404 1st Qu.:30.00
## Median : 2.200 Median : 1.600 Median : 1.503596 Median :33.10
## Mean : 2.773 Mean : 1.899 Mean : 1.753380 Mean :33.57
## 3rd Qu.: 3.200 3rd Qu.: 2.200 3rd Qu.: 1.896404 3rd Qu.:36.70
## Max. :29.300 Max. :24.200 Max. :26.503596 Max. :54.40
##
## HCT_min HCT_diff Creatinine_max Creatinine_min
## Min. : 9.00 Min. : 0.06013 Min. : 0.200 Min. : 0.200
## 1st Qu.:26.20 1st Qu.: 2.96013 1st Qu.: 0.800 1st Qu.: 0.700
## Median :29.60 Median : 5.16013 Median : 1.000 Median : 0.900
## Mean :30.08 Mean : 5.70366 Mean : 1.499 Mean : 1.319
## 3rd Qu.:33.70 3rd Qu.: 7.66013 3rd Qu.: 1.500 3rd Qu.: 1.300
## Max. :50.60 Max. :23.43987 Max. :22.000 Max. :14.100
##
## Creatinine_diff Na_max Na_min Na_diff
## Min. : 0.03245 Min. :112.0 Min. : 98 Min. : 0.2066
## 1st Qu.: 0.33245 1st Qu.:137.0 1st Qu.:136 1st Qu.: 1.7934
## Median : 0.53245 Median :140.0 Median :138 Median : 3.2066
## Mean : 0.86298 Mean :139.8 Mean :138 Mean : 4.1146
## 3rd Qu.: 0.73245 3rd Qu.:142.0 3rd Qu.:141 3rd Qu.: 5.2066
## Max. :20.76755 Max. :177.0 Max. :160 Max. :41.2066
##
## K_max K_min K_diff TroponinI_max
## Min. : 2.500 Min. :1.80 Min. : 0.03521 Min. : 0.30
## 1st Qu.: 4.000 1st Qu.:3.50 1st Qu.: 0.33521 1st Qu.: 2.60
## Median : 4.300 Median :3.90 Median : 0.56479 Median : 7.80
## Mean : 4.419 Mean :3.95 Mean : 0.69010 Mean :11.83
## 3rd Qu.: 4.700 3rd Qu.:4.30 3rd Qu.: 0.86479 3rd Qu.:17.60
## Max. :22.900 Max. :6.90 Max. :18.76479 Max. :43.40
##
## TroponinI_min TroponinI_diff
## Min. : 0.30 Min. : 0.1571
## 1st Qu.: 1.30 1st Qu.: 4.6429
## Median : 6.80 Median : 5.2571
## Mean :10.06 Mean :10.1737
## 3rd Qu.:13.20 3rd Qu.:12.1571
## Max. :42.90 Max. :37.9571
##
attach(icu_sub2)
We will now fit some univariate models for these predictors.
# Age, HR_max, HR_min, HR_diff, RespRate_max, RespRate_min, RespRate_diff, SaO2_max, SaO2_min, SaO2_diff, SysABP_max, SysABP_min, SysABP_diff, Temp_max, Temp_min, Temp_diff, GCS_max, Urine_max, Lactate_max, Lactate_min, Lactate_diff, HCT_min, Creatinine_max, Na_max, K_max, K_min, K_diff, TroponinI_max, TroponinI_min, TroponinI_diff
library(survival)
## Warning: package 'survival' was built under R version 4.1.2
##
## Attaching package: 'survival'
## The following objects are masked from 'package:faraway':
##
## rats, solder
surv_object <- Surv(Days, Status)
coxmod_null <- coxph(surv_object ~ 1, data=icu_sub2)
coxmod_age <- coxph(surv_object ~ Age, data=icu_sub2)
coxmod_HR_max <- coxph(surv_object ~ HR_max, data=icu_sub2)
coxmod_HR_min <- coxph(surv_object ~ HR_min, data=icu_sub2)
coxmod_HR_diff <- coxph(surv_object ~ HR_diff, data=icu_sub2)
coxmod_RR_max <- coxph(surv_object ~ RespRate_max, data=icu_sub2)
coxmod_RR_min <- coxph(surv_object ~ RespRate_min, data=icu_sub2)
coxmod_RR_diff <- coxph(surv_object ~ RespRate_diff, data=icu_sub2)
coxmod_O2_max <- coxph(surv_object ~ SaO2_max, data=icu_sub2)
coxmod_O2_min <- coxph(surv_object ~ SaO2_min, data=icu_sub2)
coxmod_O2_diff <- coxph(surv_object ~ SaO2_diff, data=icu_sub2)
coxmod_SBP_max <- coxph(surv_object ~ SysABP_max, data=icu_sub2)
coxmod_SBP_min <- coxph(surv_object ~ SysABP_min, data=icu_sub2)
coxmod_SBP_diff <- coxph(surv_object ~ SysABP_diff, data=icu_sub2)
coxmod_temp_max <- coxph(surv_object ~ Temp_max, data=icu_sub2)
coxmod_temp_min <- coxph(surv_object ~ Temp_min, data=icu_sub2)
coxmod_temp_diff <- coxph(surv_object ~ Temp_diff, data=icu_sub2)
coxmod_GCS_max <- coxph(surv_object ~ GCS_max, data=icu_sub2)
coxmod_GCS_min <- coxph(surv_object ~ GCS_min, data=icu_sub2)
coxmod_GCS_diff <- coxph(surv_object ~ GCS_diff, data=icu_sub2)
coxmod_U_max <- coxph(surv_object ~ Urine_max, data=icu_sub2)
coxmod_U_min <- coxph(surv_object ~ Urine_min, data=icu_sub2)
coxmod_U_diff <- coxph(surv_object ~ Urine_diff, data=icu_sub2)
coxmod_lactate_max <- coxph(surv_object ~ Lactate_max, data=icu_sub2)
coxmod_lactate_min <- coxph(surv_object ~ Lactate_min, data=icu_sub2)
coxmod_lactate_diff <- coxph(surv_object ~ Lactate_diff, data=icu_sub2)
coxmod_HCT_max <- coxph(surv_object ~ HCT_max, data=icu_sub2)
coxmod_HCT_min <- coxph(surv_object ~ HCT_min, data=icu_sub2)
coxmod_HCT_diff <- coxph(surv_object ~ HCT_diff, data=icu_sub2)
coxmod_Cr_max <- coxph(surv_object ~ Creatinine_max, data=icu_sub2)
coxmod_Cr_min <- coxph(surv_object ~ Creatinine_min, data=icu_sub2)
coxmod_Cr_diff <- coxph(surv_object ~ Creatinine_diff, data=icu_sub2)
coxmod_Na_max <- coxph(surv_object ~ Na_max, data=icu_sub2)
coxmod_Na_min <- coxph(surv_object ~ Na_min, data=icu_sub2)
coxmod_Na_diff <- coxph(surv_object ~ Na_diff, data=icu_sub2)
coxmod_K_max <- coxph(surv_object ~ K_max, data=icu_sub2)
coxmod_K_min <- coxph(surv_object ~ K_min, data=icu_sub2)
coxmod_K_diff <- coxph(surv_object ~ K_diff, data=icu_sub2)
coxmod_trp_max <- coxph(surv_object ~ TroponinI_max, data=icu_sub2)
coxmod_trp_min <- coxph(surv_object ~ TroponinI_min, data=icu_sub2)
coxmod_trp_diff <- coxph(surv_object ~ TroponinI_diff, data=icu_sub2)
summary(coxmod_null)
## Call: coxph(formula = surv_object ~ 1, data = icu_sub2)
##
## Null model
## log likelihood= -5731.446
## n= 2061
summary(coxmod_age)
## Call:
## coxph(formula = surv_object ~ Age, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Age 0.03355 1.03412 0.00250 13.42 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Age 1.034 0.967 1.029 1.039
##
## Concordance= 0.646 (se = 0.01 )
## Likelihood ratio test= 209.4 on 1 df, p=<0.0000000000000002
## Wald test = 180.1 on 1 df, p=<0.0000000000000002
## Score (logrank) test = 187 on 1 df, p=<0.0000000000000002
summary(coxmod_HR_max)
## Call:
## coxph(formula = surv_object ~ HR_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## HR_max 0.002779 1.002783 0.001648 1.687 0.0916 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## HR_max 1.003 0.9972 0.9996 1.006
##
## Concordance= 0.515 (se = 0.011 )
## Likelihood ratio test= 2.79 on 1 df, p=0.09
## Wald test = 2.85 on 1 df, p=0.09
## Score (logrank) test = 2.84 on 1 df, p=0.09
summary(coxmod_HR_min)
## Call:
## coxph(formula = surv_object ~ HR_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## HR_min -0.0009841 0.9990164 0.0024165 -0.407 0.684
##
## exp(coef) exp(-coef) lower .95 upper .95
## HR_min 0.999 1.001 0.9943 1.004
##
## Concordance= 0.498 (se = 0.011 )
## Likelihood ratio test= 0.17 on 1 df, p=0.7
## Wald test = 0.17 on 1 df, p=0.7
## Score (logrank) test = 0.17 on 1 df, p=0.7
summary(coxmod_HR_diff)
## Call:
## coxph(formula = surv_object ~ HR_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## HR_diff 0.009142 1.009184 0.002002 4.567 0.00000494 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## HR_diff 1.009 0.9909 1.005 1.013
##
## Concordance= 0.547 (se = 0.011 )
## Likelihood ratio test= 18.04 on 1 df, p=0.00002
## Wald test = 20.86 on 1 df, p=0.000005
## Score (logrank) test = 20.34 on 1 df, p=0.000006
summary(coxmod_RR_max)
## Call:
## coxph(formula = surv_object ~ RespRate_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## RespRate_max 0.01472 1.01483 0.00432 3.407 0.000657 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## RespRate_max 1.015 0.9854 1.006 1.023
##
## Concordance= 0.535 (se = 0.011 )
## Likelihood ratio test= 10.93 on 1 df, p=0.0009
## Wald test = 11.61 on 1 df, p=0.0007
## Score (logrank) test = 11.54 on 1 df, p=0.0007
summary(coxmod_RR_min)
## Call:
## coxph(formula = surv_object ~ RespRate_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## RespRate_min 0.042945 1.043880 0.009451 4.544 0.00000552 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## RespRate_min 1.044 0.958 1.025 1.063
##
## Concordance= 0.555 (se = 0.01 )
## Likelihood ratio test= 20.44 on 1 df, p=0.000006
## Wald test = 20.65 on 1 df, p=0.000006
## Score (logrank) test = 20.67 on 1 df, p=0.000005
summary(coxmod_RR_diff)
## Call:
## coxph(formula = surv_object ~ RespRate_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## RespRate_diff 0.015401 1.015520 0.005107 3.016 0.00257 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## RespRate_diff 1.016 0.9847 1.005 1.026
##
## Concordance= 0.526 (se = 0.011 )
## Likelihood ratio test= 8.4 on 1 df, p=0.004
## Wald test = 9.09 on 1 df, p=0.003
## Score (logrank) test = 9.03 on 1 df, p=0.003
summary(coxmod_O2_max)
## Call:
## coxph(formula = surv_object ~ SaO2_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SaO2_max -0.02069 0.97953 0.01631 -1.268 0.205
##
## exp(coef) exp(-coef) lower .95 upper .95
## SaO2_max 0.9795 1.021 0.9487 1.011
##
## Concordance= 0.513 (se = 0.01 )
## Likelihood ratio test= 1.53 on 1 df, p=0.2
## Wald test = 1.61 on 1 df, p=0.2
## Score (logrank) test = 1.61 on 1 df, p=0.2
summary(coxmod_O2_min)
## Call:
## coxph(formula = surv_object ~ SaO2_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SaO2_min -0.012080 0.987993 0.008003 -1.51 0.131
##
## exp(coef) exp(-coef) lower .95 upper .95
## SaO2_min 0.988 1.012 0.9726 1.004
##
## Concordance= 0.509 (se = 0.01 )
## Likelihood ratio test= 2.03 on 1 df, p=0.2
## Wald test = 2.28 on 1 df, p=0.1
## Score (logrank) test = 2.27 on 1 df, p=0.1
summary(coxmod_O2_diff)
## Call:
## coxph(formula = surv_object ~ SaO2_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SaO2_diff 0.013940 1.014037 0.008367 1.666 0.0957 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## SaO2_diff 1.014 0.9862 0.9975 1.031
##
## Concordance= 0.518 (se = 0.01 )
## Likelihood ratio test= 2.41 on 1 df, p=0.1
## Wald test = 2.78 on 1 df, p=0.1
## Score (logrank) test = 2.77 on 1 df, p=0.1
summary(coxmod_SBP_max)
## Call:
## coxph(formula = surv_object ~ SysABP_max, data = icu_sub2)
##
## n= 1346, number of events= 480
## (715 observations deleted due to missingness)
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SysABP_max 0.003683 1.003690 0.001743 2.113 0.0346 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## SysABP_max 1.004 0.9963 1 1.007
##
## Concordance= 0.52 (se = 0.014 )
## Likelihood ratio test= 4.35 on 1 df, p=0.04
## Wald test = 4.47 on 1 df, p=0.03
## Score (logrank) test = 4.46 on 1 df, p=0.03
summary(coxmod_SBP_min)
## Call:
## coxph(formula = surv_object ~ SysABP_min, data = icu_sub2)
##
## n= 1346, number of events= 480
## (715 observations deleted due to missingness)
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SysABP_min -0.003485 0.996521 0.002457 -1.418 0.156
##
## exp(coef) exp(-coef) lower .95 upper .95
## SysABP_min 0.9965 1.003 0.9917 1.001
##
## Concordance= 0.521 (se = 0.014 )
## Likelihood ratio test= 2.05 on 1 df, p=0.2
## Wald test = 2.01 on 1 df, p=0.2
## Score (logrank) test = 2.01 on 1 df, p=0.2
summary(coxmod_SBP_diff)
## Call:
## coxph(formula = surv_object ~ SysABP_diff, data = icu_sub2)
##
## n= 1346, number of events= 480
## (715 observations deleted due to missingness)
##
## coef exp(coef) se(coef) z Pr(>|z|)
## SysABP_diff 0.009697 1.009745 0.002007 4.832 0.00000135 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## SysABP_diff 1.01 0.9903 1.006 1.014
##
## Concordance= 0.568 (se = 0.013 )
## Likelihood ratio test= 20.97 on 1 df, p=0.000005
## Wald test = 23.35 on 1 df, p=0.000001
## Score (logrank) test = 23.29 on 1 df, p=0.000001
summary(coxmod_temp_max)
## Call:
## coxph(formula = surv_object ~ Temp_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Temp_max -0.1966 0.8215 0.0493 -3.988 0.0000668 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Temp_max 0.8215 1.217 0.7459 0.9049
##
## Concordance= 0.55 (se = 0.011 )
## Likelihood ratio test= 16.34 on 1 df, p=0.00005
## Wald test = 15.9 on 1 df, p=0.00007
## Score (logrank) test = 15.89 on 1 df, p=0.00007
summary(coxmod_temp_min)
## Call:
## coxph(formula = surv_object ~ Temp_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Temp_min -0.10541 0.89996 0.03894 -2.707 0.00679 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Temp_min 0.9 1.111 0.8338 0.9713
##
## Concordance= 0.528 (se = 0.011 )
## Likelihood ratio test= 6.85 on 1 df, p=0.009
## Wald test = 7.33 on 1 df, p=0.007
## Score (logrank) test = 7.21 on 1 df, p=0.007
summary(coxmod_temp_diff)
## Call:
## coxph(formula = surv_object ~ Temp_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Temp_diff 0.09734 1.10224 0.04601 2.116 0.0344 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Temp_diff 1.102 0.9072 1.007 1.206
##
## Concordance= 0.514 (se = 0.011 )
## Likelihood ratio test= 4.16 on 1 df, p=0.04
## Wald test = 4.48 on 1 df, p=0.03
## Score (logrank) test = 4.42 on 1 df, p=0.04
summary(coxmod_GCS_max)
## Call:
## coxph(formula = surv_object ~ GCS_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## GCS_max -0.08204 0.92123 0.01061 -7.731 0.0000000000000107 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## GCS_max 0.9212 1.086 0.9023 0.9406
##
## Concordance= 0.576 (se = 0.01 )
## Likelihood ratio test= 54.07 on 1 df, p=0.0000000000002
## Wald test = 59.76 on 1 df, p=0.00000000000001
## Score (logrank) test = 60.81 on 1 df, p=0.000000000000006
summary(coxmod_GCS_min)
## Call:
## coxph(formula = surv_object ~ GCS_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## GCS_min 0.006052 1.006070 0.007324 0.826 0.409
##
## exp(coef) exp(-coef) lower .95 upper .95
## GCS_min 1.006 0.994 0.9917 1.021
##
## Concordance= 0.501 (se = 0.01 )
## Likelihood ratio test= 0.68 on 1 df, p=0.4
## Wald test = 0.68 on 1 df, p=0.4
## Score (logrank) test = 0.68 on 1 df, p=0.4
summary(coxmod_GCS_diff)
## Call:
## coxph(formula = surv_object ~ GCS_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## GCS_diff -0.04492 0.95607 0.01685 -2.667 0.00766 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## GCS_diff 0.9561 1.046 0.925 0.9882
##
## Concordance= 0.518 (se = 0.01 )
## Likelihood ratio test= 7.24 on 1 df, p=0.007
## Wald test = 7.11 on 1 df, p=0.008
## Score (logrank) test = 7.13 on 1 df, p=0.008
summary(coxmod_U_max)
## Call:
## coxph(formula = surv_object ~ Urine_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Urine_max -0.0007737 0.9992266 0.0001047 -7.391 0.000000000000146 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Urine_max 0.9992 1.001 0.999 0.9994
##
## Concordance= 0.619 (se = 0.01 )
## Likelihood ratio test= 70.46 on 1 df, p=<0.0000000000000002
## Wald test = 54.63 on 1 df, p=0.0000000000001
## Score (logrank) test = 53.95 on 1 df, p=0.0000000000002
summary(coxmod_U_min)
## Call:
## coxph(formula = surv_object ~ Urine_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Urine_min -0.0019252 0.9980767 0.0007179 -2.682 0.00733 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Urine_min 0.9981 1.002 0.9967 0.9995
##
## Concordance= 0.525 (se = 0.01 )
## Likelihood ratio test= 8.51 on 1 df, p=0.004
## Wald test = 7.19 on 1 df, p=0.007
## Score (logrank) test = 7.23 on 1 df, p=0.007
summary(coxmod_U_diff)
## Call:
## coxph(formula = surv_object ~ Urine_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Urine_diff -0.0007208 0.9992795 0.0001063 -6.78 0.000000000012 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Urine_diff 0.9993 1.001 0.9991 0.9995
##
## Concordance= 0.612 (se = 0.01 )
## Likelihood ratio test= 59.4 on 1 df, p=0.00000000000001
## Wald test = 45.97 on 1 df, p=0.00000000001
## Score (logrank) test = 45.77 on 1 df, p=0.00000000001
summary(coxmod_lactate_max)
## Call:
## coxph(formula = surv_object ~ Lactate_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Lactate_max 0.05778 1.05948 0.01666 3.467 0.000526 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Lactate_max 1.059 0.9439 1.025 1.095
##
## Concordance= 0.508 (se = 0.011 )
## Likelihood ratio test= 10.95 on 1 df, p=0.0009
## Wald test = 12.02 on 1 df, p=0.0005
## Score (logrank) test = 12.03 on 1 df, p=0.0005
summary(coxmod_lactate_min)
## Call:
## coxph(formula = surv_object ~ Lactate_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Lactate_min 0.09916 1.10424 0.02685 3.693 0.000222 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Lactate_min 1.104 0.9056 1.048 1.164
##
## Concordance= 0.525 (se = 0.011 )
## Likelihood ratio test= 12.03 on 1 df, p=0.0005
## Wald test = 13.64 on 1 df, p=0.0002
## Score (logrank) test = 13.52 on 1 df, p=0.0002
summary(coxmod_lactate_diff)
## Call:
## coxph(formula = surv_object ~ Lactate_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Lactate_diff 0.11039 1.11672 0.02168 5.092 0.000000355 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Lactate_diff 1.117 0.8955 1.07 1.165
##
## Concordance= 0.512 (se = 0.011 )
## Likelihood ratio test= 21.03 on 1 df, p=0.000005
## Wald test = 25.93 on 1 df, p=0.0000004
## Score (logrank) test = 25.86 on 1 df, p=0.0000004
summary(coxmod_HCT_max)
## Call:
## coxph(formula = surv_object ~ HCT_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## HCT_max -0.025207 0.975108 0.007591 -3.321 0.000898 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## HCT_max 0.9751 1.026 0.9607 0.9897
##
## Concordance= 0.536 (se = 0.011 )
## Likelihood ratio test= 11.25 on 1 df, p=0.0008
## Wald test = 11.03 on 1 df, p=0.0009
## Score (logrank) test = 11.03 on 1 df, p=0.0009
summary(coxmod_HCT_min)
## Call:
## coxph(formula = surv_object ~ HCT_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## HCT_min -0.010433 0.989621 0.006279 -1.662 0.0966 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## HCT_min 0.9896 1.01 0.9775 1.002
##
## Concordance= 0.516 (se = 0.01 )
## Likelihood ratio test= 2.77 on 1 df, p=0.1
## Wald test = 2.76 on 1 df, p=0.1
## Score (logrank) test = 2.76 on 1 df, p=0.1
summary(coxmod_HCT_diff)
## Call:
## coxph(formula = surv_object ~ HCT_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## HCT_diff -0.03417 0.96641 0.01065 -3.209 0.00133 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## HCT_diff 0.9664 1.035 0.9465 0.9868
##
## Concordance= 0.537 (se = 0.011 )
## Likelihood ratio test= 10.7 on 1 df, p=0.001
## Wald test = 10.3 on 1 df, p=0.001
## Score (logrank) test = 10.31 on 1 df, p=0.001
summary(coxmod_Cr_max)
## Call:
## coxph(formula = surv_object ~ Creatinine_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Creatinine_max 0.10152 1.10685 0.01467 6.92 0.0000000000045 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Creatinine_max 1.107 0.9035 1.075 1.139
##
## Concordance= 0.594 (se = 0.011 )
## Likelihood ratio test= 35.11 on 1 df, p=0.000000003
## Wald test = 47.89 on 1 df, p=0.000000000005
## Score (logrank) test = 48.54 on 1 df, p=0.000000000003
summary(coxmod_Cr_min)
## Call:
## coxph(formula = surv_object ~ Creatinine_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Creatinine_min 0.12901 1.13770 0.01751 7.367 0.000000000000175 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Creatinine_min 1.138 0.879 1.099 1.177
##
## Concordance= 0.606 (se = 0.01 )
## Likelihood ratio test= 41.47 on 1 df, p=0.0000000001
## Wald test = 54.27 on 1 df, p=0.0000000000002
## Score (logrank) test = 56.43 on 1 df, p=0.00000000000006
summary(coxmod_Cr_diff)
## Call:
## coxph(formula = surv_object ~ Creatinine_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Creatinine_diff 0.08556 1.08932 0.01793 4.772 0.00000182 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Creatinine_diff 1.089 0.918 1.052 1.128
##
## Concordance= 0.548 (se = 0.011 )
## Likelihood ratio test= 17.49 on 1 df, p=0.00003
## Wald test = 22.77 on 1 df, p=0.000002
## Score (logrank) test = 23.05 on 1 df, p=0.000002
summary(coxmod_Na_max)
## Call:
## coxph(formula = surv_object ~ Na_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Na_max -0.015698 0.984424 0.008287 -1.894 0.0582 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Na_max 0.9844 1.016 0.9686 1.001
##
## Concordance= 0.521 (se = 0.011 )
## Likelihood ratio test= 3.61 on 1 df, p=0.06
## Wald test = 3.59 on 1 df, p=0.06
## Score (logrank) test = 3.57 on 1 df, p=0.06
summary(coxmod_Na_min)
## Call:
## coxph(formula = surv_object ~ Na_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Na_min -0.021187 0.979036 0.007371 -2.875 0.00405 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Na_min 0.979 1.021 0.965 0.9933
##
## Concordance= 0.536 (se = 0.011 )
## Likelihood ratio test= 7.74 on 1 df, p=0.005
## Wald test = 8.26 on 1 df, p=0.004
## Score (logrank) test = 8.16 on 1 df, p=0.004
summary(coxmod_Na_diff)
## Call:
## coxph(formula = surv_object ~ Na_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Na_diff 0.042040 1.042936 0.007105 5.917 0.00000000328 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Na_diff 1.043 0.9588 1.029 1.058
##
## Concordance= 0.574 (se = 0.01 )
## Likelihood ratio test= 27.6 on 1 df, p=0.0000001
## Wald test = 35.01 on 1 df, p=0.000000003
## Score (logrank) test = 34.64 on 1 df, p=0.000000004
summary(coxmod_K_max)
## Call:
## coxph(formula = surv_object ~ K_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## K_max 0.07306 1.07579 0.02958 2.469 0.0135 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## K_max 1.076 0.9295 1.015 1.14
##
## Concordance= 0.527 (se = 0.011 )
## Likelihood ratio test= 4.81 on 1 df, p=0.03
## Wald test = 6.1 on 1 df, p=0.01
## Score (logrank) test = 5.98 on 1 df, p=0.01
summary(coxmod_K_min)
## Call:
## coxph(formula = surv_object ~ K_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## K_min 0.03906 1.03983 0.06125 0.638 0.524
##
## exp(coef) exp(-coef) lower .95 upper .95
## K_min 1.04 0.9617 0.9222 1.172
##
## Concordance= 0.502 (se = 0.011 )
## Likelihood ratio test= 0.41 on 1 df, p=0.5
## Wald test = 0.41 on 1 df, p=0.5
## Score (logrank) test = 0.41 on 1 df, p=0.5
summary(coxmod_K_diff)
## Call:
## coxph(formula = surv_object ~ K_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## K_diff 0.08674 1.09062 0.03006 2.886 0.00391 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## K_diff 1.091 0.9169 1.028 1.157
##
## Concordance= 0.542 (se = 0.01 )
## Likelihood ratio test= 5.99 on 1 df, p=0.01
## Wald test = 8.33 on 1 df, p=0.004
## Score (logrank) test = 8.39 on 1 df, p=0.004
summary(coxmod_trp_max)
## Call:
## coxph(formula = surv_object ~ TroponinI_max, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## TroponinI_max -0.00000905 0.99999095 0.00324098 -0.003 0.998
##
## exp(coef) exp(-coef) lower .95 upper .95
## TroponinI_max 1 1 0.9937 1.006
##
## Concordance= 0.508 (se = 0.011 )
## Likelihood ratio test= 0 on 1 df, p=1
## Wald test = 0 on 1 df, p=1
## Score (logrank) test = 0 on 1 df, p=1
summary(coxmod_trp_min)
## Call:
## coxph(formula = surv_object ~ TroponinI_min, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## TroponinI_min 0.001945 1.001947 0.003327 0.585 0.559
##
## exp(coef) exp(-coef) lower .95 upper .95
## TroponinI_min 1.002 0.9981 0.9954 1.009
##
## Concordance= 0.498 (se = 0.011 )
## Likelihood ratio test= 0.34 on 1 df, p=0.6
## Wald test = 0.34 on 1 df, p=0.6
## Score (logrank) test = 0.34 on 1 df, p=0.6
summary(coxmod_trp_diff)
## Call:
## coxph(formula = surv_object ~ TroponinI_diff, data = icu_sub2)
##
## n= 2061, number of events= 773
##
## coef exp(coef) se(coef) z Pr(>|z|)
## TroponinI_diff 0.002049 1.002051 0.003809 0.538 0.591
##
## exp(coef) exp(-coef) lower .95 upper .95
## TroponinI_diff 1.002 0.998 0.9946 1.01
##
## Concordance= 0.496 (se = 0.011 )
## Likelihood ratio test= 0.29 on 1 df, p=0.6
## Wald test = 0.29 on 1 df, p=0.6
## Score (logrank) test = 0.29 on 1 df, p=0.6
From the univariate models, it appears Age, RespRate_min, SysABP_max, Temp_max, GCS_max, Urine_max, Lactate_max, Creatinine_max are statistically significant predictors for survival.
We now fit some multivariate models.
surv_object_c <- Surv(icu_sub2c$Days, icu_sub2c$Status)
# all variables
coxmod_mv_all <- coxph(surv_object_c ~ Age + HR_max + HR_min + HR_diff + RespRate_max + RespRate_min + RespRate_diff + SaO2_max + SaO2_min + SaO2_diff + SysABP_max + SysABP_min + SysABP_diff + Temp_max + Temp_min + Temp_diff + GCS_max + GCS_min + GCS_diff + Urine_max + Urine_min + Urine_diff + Lactate_max + Lactate_min + Lactate_diff + HCT_max + HCT_min + HCT_diff + Creatinine_max + Creatinine_min + Creatinine_diff + Na_max + Na_min + Na_diff + K_max + K_min + K_diff + TroponinI_max + TroponinI_min + TroponinI_diff, data=icu_sub2c)
# only significant variables
coxmod_mv_uni <- coxph(surv_object_c ~ Age + HR_diff + RespRate_max + RespRate_min + RespRate_diff + SysABP_max + SysABP_diff + Temp_max + Temp_min + Temp_diff + GCS_max + GCS_diff + Urine_max + Urine_min + Urine_diff + Lactate_max + Lactate_min + Lactate_diff + HCT_max + HCT_diff + Creatinine_max + Creatinine_min + Creatinine_diff + Na_min + Na_diff + K_max + K_diff, data=icu_sub2c)
sum_all <- summary(coxmod_mv_all)
sum_all
## Call:
## coxph(formula = surv_object_c ~ Age + HR_max + HR_min + HR_diff +
## RespRate_max + RespRate_min + RespRate_diff + SaO2_max +
## SaO2_min + SaO2_diff + SysABP_max + SysABP_min + SysABP_diff +
## Temp_max + Temp_min + Temp_diff + GCS_max + GCS_min + GCS_diff +
## Urine_max + Urine_min + Urine_diff + Lactate_max + Lactate_min +
## Lactate_diff + HCT_max + HCT_min + HCT_diff + Creatinine_max +
## Creatinine_min + Creatinine_diff + Na_max + Na_min + Na_diff +
## K_max + K_min + K_diff + TroponinI_max + TroponinI_min +
## TroponinI_diff, data = icu_sub2c)
##
## n= 1346, number of events= 480
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Age 0.0366745 1.0373553 0.0038864 9.437 < 0.0000000000000002
## HR_max -0.0024127 0.9975902 0.0040203 -0.600 0.548414
## HR_min 0.0122872 1.0123630 0.0045747 2.686 0.007233
## HR_diff 0.0100539 1.0101046 0.0046360 2.169 0.030110
## RespRate_max -0.0159147 0.9842113 0.0182757 -0.871 0.383856
## RespRate_min -0.0283466 0.9720514 0.0176052 -1.610 0.107372
## RespRate_diff 0.0389168 1.0396839 0.0192013 2.027 0.042685
## SaO2_max -0.0630377 0.9389081 0.0248164 -2.540 0.011080
## SaO2_min -0.0167283 0.9834109 0.0342957 -0.488 0.625715
## SaO2_diff -0.0089772 0.9910630 0.0366000 -0.245 0.806241
## SysABP_max -0.0048270 0.9951847 0.0036025 -1.340 0.180278
## SysABP_min 0.0014086 1.0014096 0.0027414 0.514 0.607358
## SysABP_diff 0.0124514 1.0125292 0.0043185 2.883 0.003936
## Temp_max -0.2889862 0.7490226 0.1007526 -2.868 0.004127
## Temp_min 0.0287428 1.0291598 0.1026217 0.280 0.779412
## Temp_diff 0.1159784 1.1229716 0.1104795 1.050 0.293822
## GCS_max -0.1170651 0.8895273 0.0196783 -5.949 0.0000000027
## GCS_min -0.0175839 0.9825698 0.0231249 -0.760 0.447023
## GCS_diff -0.0367557 0.9639116 0.0384615 -0.956 0.339251
## Urine_max -0.0032547 0.9967506 0.0012598 -2.584 0.009779
## Urine_min 0.0017853 1.0017869 0.0014427 1.238 0.215897
## Urine_diff 0.0030560 1.0030607 0.0012956 2.359 0.018342
## Lactate_max -0.0656785 0.9364319 0.0411289 -1.597 0.110289
## Lactate_min 0.0328107 1.0333549 0.0401822 0.817 0.414186
## Lactate_diff 0.1132221 1.1198806 0.0464257 2.439 0.014737
## HCT_max -0.0500442 0.9511874 0.0203604 -2.458 0.013974
## HCT_min 0.0187108 1.0188870 0.0155944 1.200 0.230202
## HCT_diff 0.0065199 1.0065412 0.0183780 0.355 0.722764
## Creatinine_max 0.1364667 1.1462167 0.1738609 0.785 0.432501
## Creatinine_min 0.1804765 1.1977880 0.1592844 1.133 0.257195
## Creatinine_diff -0.2708730 0.7627133 0.1182898 -2.290 0.022027
## Na_max 0.0002665 1.0002665 0.0239580 0.011 0.991125
## Na_min -0.0202542 0.9799495 0.0254659 -0.795 0.426411
## Na_diff 0.0558943 1.0574859 0.0145153 3.851 0.000118
## K_max 0.0987985 1.1038438 0.1275291 0.775 0.438509
## K_min -0.1355602 0.8732266 0.1172407 -1.156 0.247577
## K_diff 0.0664321 1.0686884 0.1348975 0.492 0.622391
## TroponinI_max -0.0262429 0.9740984 0.0126410 -2.076 0.037893
## TroponinI_min 0.0218979 1.0221394 0.0115996 1.888 0.059050
## TroponinI_diff -0.0002753 0.9997248 0.0170100 -0.016 0.987089
##
## Age ***
## HR_max
## HR_min **
## HR_diff *
## RespRate_max
## RespRate_min
## RespRate_diff *
## SaO2_max *
## SaO2_min
## SaO2_diff
## SysABP_max
## SysABP_min
## SysABP_diff **
## Temp_max **
## Temp_min
## Temp_diff
## GCS_max ***
## GCS_min
## GCS_diff
## Urine_max **
## Urine_min
## Urine_diff *
## Lactate_max
## Lactate_min
## Lactate_diff *
## HCT_max *
## HCT_min
## HCT_diff
## Creatinine_max
## Creatinine_min
## Creatinine_diff *
## Na_max
## Na_min
## Na_diff ***
## K_max
## K_min
## K_diff
## TroponinI_max *
## TroponinI_min .
## TroponinI_diff
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Age 1.0374 0.9640 1.0295 1.0453
## HR_max 0.9976 1.0024 0.9898 1.0055
## HR_min 1.0124 0.9878 1.0033 1.0215
## HR_diff 1.0101 0.9900 1.0010 1.0193
## RespRate_max 0.9842 1.0160 0.9496 1.0201
## RespRate_min 0.9721 1.0288 0.9391 1.0062
## RespRate_diff 1.0397 0.9618 1.0013 1.0796
## SaO2_max 0.9389 1.0651 0.8943 0.9857
## SaO2_min 0.9834 1.0169 0.9195 1.0518
## SaO2_diff 0.9911 1.0090 0.9225 1.0648
## SysABP_max 0.9952 1.0048 0.9882 1.0022
## SysABP_min 1.0014 0.9986 0.9960 1.0068
## SysABP_diff 1.0125 0.9876 1.0040 1.0211
## Temp_max 0.7490 1.3351 0.6148 0.9125
## Temp_min 1.0292 0.9717 0.8416 1.2584
## Temp_diff 1.1230 0.8905 0.9043 1.3945
## GCS_max 0.8895 1.1242 0.8559 0.9245
## GCS_min 0.9826 1.0177 0.9390 1.0281
## GCS_diff 0.9639 1.0374 0.8939 1.0394
## Urine_max 0.9968 1.0033 0.9943 0.9992
## Urine_min 1.0018 0.9982 0.9990 1.0046
## Urine_diff 1.0031 0.9969 1.0005 1.0056
## Lactate_max 0.9364 1.0679 0.8639 1.0150
## Lactate_min 1.0334 0.9677 0.9551 1.1180
## Lactate_diff 1.1199 0.8930 1.0225 1.2266
## HCT_max 0.9512 1.0513 0.9140 0.9899
## HCT_min 1.0189 0.9815 0.9882 1.0505
## HCT_diff 1.0065 0.9935 0.9709 1.0435
## Creatinine_max 1.1462 0.8724 0.8152 1.6116
## Creatinine_min 1.1978 0.8349 0.8766 1.6367
## Creatinine_diff 0.7627 1.3111 0.6049 0.9617
## Na_max 1.0003 0.9997 0.9544 1.0484
## Na_min 0.9799 1.0205 0.9322 1.0301
## Na_diff 1.0575 0.9456 1.0278 1.0880
## K_max 1.1038 0.9059 0.8597 1.4173
## K_min 0.8732 1.1452 0.6940 1.0988
## K_diff 1.0687 0.9357 0.8204 1.3921
## TroponinI_max 0.9741 1.0266 0.9503 0.9985
## TroponinI_min 1.0221 0.9783 0.9992 1.0456
## TroponinI_diff 0.9997 1.0003 0.9669 1.0336
##
## Concordance= 0.756 (se = 0.011 )
## Likelihood ratio test= 405.9 on 40 df, p=<0.0000000000000002
## Wald test = 428.6 on 40 df, p=<0.0000000000000002
## Score (logrank) test = 471.8 on 40 df, p=<0.0000000000000002
sum_uni <- summary(coxmod_mv_uni)
sum_uni
## Call:
## coxph(formula = surv_object_c ~ Age + HR_diff + RespRate_max +
## RespRate_min + RespRate_diff + SysABP_max + SysABP_diff +
## Temp_max + Temp_min + Temp_diff + GCS_max + GCS_diff + Urine_max +
## Urine_min + Urine_diff + Lactate_max + Lactate_min + Lactate_diff +
## HCT_max + HCT_diff + Creatinine_max + Creatinine_min + Creatinine_diff +
## Na_min + Na_diff + K_max + K_diff, data = icu_sub2c)
##
## n= 1346, number of events= 480
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Age 0.031026 1.031513 0.003569 8.693 < 0.0000000000000002 ***
## HR_diff 0.006319 1.006339 0.002379 2.656 0.00791 **
## RespRate_max -0.017655 0.982500 0.017638 -1.001 0.31685
## RespRate_min -0.009486 0.990558 0.017205 -0.551 0.58139
## RespRate_diff 0.038530 1.039282 0.018836 2.046 0.04080 *
## SysABP_max -0.004123 0.995885 0.002953 -1.396 0.16266
## SysABP_diff 0.010669 1.010727 0.003829 2.787 0.00532 **
## Temp_max -0.279709 0.756003 0.097283 -2.875 0.00404 **
## Temp_min 0.092078 1.096450 0.100764 0.914 0.36082
## Temp_diff 0.123261 1.131179 0.110450 1.116 0.26443
## GCS_max -0.118481 0.888268 0.017033 -6.956 0.0000000000035 ***
## GCS_diff -0.051325 0.949970 0.023432 -2.190 0.02850 *
## Urine_max -0.003461 0.996545 0.001228 -2.820 0.00481 **
## Urine_min 0.001277 1.001278 0.001316 0.971 0.33172
## Urine_diff 0.003235 1.003241 0.001264 2.561 0.01045 *
## Lactate_max -0.048952 0.952227 0.039516 -1.239 0.21543
## Lactate_min 0.035337 1.035969 0.038464 0.919 0.35825
## Lactate_diff 0.121682 1.129395 0.045824 2.655 0.00792 **
## HCT_max -0.018848 0.981328 0.011580 -1.628 0.10361
## HCT_diff -0.004972 0.995040 0.015823 -0.314 0.75335
## Creatinine_max 0.145474 1.156588 0.167426 0.869 0.38491
## Creatinine_min 0.189222 1.208309 0.149279 1.268 0.20495
## Creatinine_diff -0.303918 0.737921 0.116603 -2.606 0.00915 **
## Na_min -0.016707 0.983432 0.009784 -1.708 0.08772 .
## Na_diff 0.054959 1.056497 0.011857 4.635 0.0000035650049 ***
## K_max 0.036094 1.036754 0.094147 0.383 0.70144
## K_diff 0.140343 1.150668 0.117254 1.197 0.23134
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Age 1.0315 0.9694 1.0243 1.0388
## HR_diff 1.0063 0.9937 1.0017 1.0110
## RespRate_max 0.9825 1.0178 0.9491 1.0171
## RespRate_min 0.9906 1.0095 0.9577 1.0245
## RespRate_diff 1.0393 0.9622 1.0016 1.0784
## SysABP_max 0.9959 1.0041 0.9901 1.0017
## SysABP_diff 1.0107 0.9894 1.0032 1.0183
## Temp_max 0.7560 1.3227 0.6248 0.9148
## Temp_min 1.0965 0.9120 0.9000 1.3359
## Temp_diff 1.1312 0.8840 0.9110 1.4046
## GCS_max 0.8883 1.1258 0.8591 0.9184
## GCS_diff 0.9500 1.0527 0.9073 0.9946
## Urine_max 0.9965 1.0035 0.9941 0.9989
## Urine_min 1.0013 0.9987 0.9987 1.0039
## Urine_diff 1.0032 0.9968 1.0008 1.0057
## Lactate_max 0.9522 1.0502 0.8813 1.0289
## Lactate_min 1.0360 0.9653 0.9607 1.1171
## Lactate_diff 1.1294 0.8854 1.0324 1.2355
## HCT_max 0.9813 1.0190 0.9593 1.0039
## HCT_diff 0.9950 1.0050 0.9647 1.0264
## Creatinine_max 1.1566 0.8646 0.8330 1.6058
## Creatinine_min 1.2083 0.8276 0.9018 1.6190
## Creatinine_diff 0.7379 1.3552 0.5872 0.9274
## Na_min 0.9834 1.0168 0.9648 1.0025
## Na_diff 1.0565 0.9465 1.0322 1.0813
## K_max 1.0368 0.9645 0.8621 1.2468
## K_diff 1.1507 0.8691 0.9144 1.4480
##
## Concordance= 0.749 (se = 0.011 )
## Likelihood ratio test= 379.6 on 27 df, p=<0.0000000000000002
## Wald test = 397.7 on 27 df, p=<0.0000000000000002
## Score (logrank) test = 439.7 on 27 df, p=<0.0000000000000002
# According to the summary of coxmod_all, use only significant vars.
coxmod_mv_redc <- coxph(surv_object_c ~ Age + HR_min + HR_diff + RespRate_diff + SaO2_max + SysABP_diff + Temp_max + GCS_max + Urine_max + Urine_diff + Lactate_diff + HCT_max + Creatinine_diff + Na_diff + TroponinI_max, data=icu_sub2c)
sum_redc <- summary(coxmod_mv_redc)
sum_redc
## Call:
## coxph(formula = surv_object_c ~ Age + HR_min + HR_diff + RespRate_diff +
## SaO2_max + SysABP_diff + Temp_max + GCS_max + Urine_max +
## Urine_diff + Lactate_diff + HCT_max + Creatinine_diff + Na_diff +
## TroponinI_max, data = icu_sub2c)
##
## n= 1346, number of events= 480
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Age 0.033259 1.033818 0.003528 9.427 < 0.0000000000000002 ***
## HR_min 0.006521 1.006542 0.003215 2.028 0.042543 *
## HR_diff 0.008625 1.008662 0.002417 3.569 0.000358 ***
## RespRate_diff 0.019845 1.020043 0.007096 2.797 0.005164 **
## SaO2_max -0.066007 0.936124 0.022146 -2.981 0.002877 **
## SysABP_diff 0.007242 1.007268 0.002086 3.472 0.000517 ***
## Temp_max -0.286684 0.750749 0.068475 -4.187 0.0000283028463 ***
## GCS_max -0.095164 0.909224 0.014491 -6.567 0.0000000000513 ***
## Urine_max -0.004329 0.995681 0.001131 -3.828 0.000129 ***
## Urine_diff 0.004031 1.004039 0.001172 3.439 0.000583 ***
## Lactate_diff 0.074873 1.077747 0.023042 3.249 0.001156 **
## HCT_max -0.029547 0.970885 0.010736 -2.752 0.005918 **
## Creatinine_diff 0.039996 1.040807 0.029295 1.365 0.172157
## Na_diff 0.063125 1.065160 0.011227 5.623 0.0000000188188 ***
## TroponinI_max -0.009030 0.991011 0.004544 -1.987 0.046908 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Age 1.0338 0.9673 1.0267 1.0410
## HR_min 1.0065 0.9935 1.0002 1.0129
## HR_diff 1.0087 0.9914 1.0039 1.0135
## RespRate_diff 1.0200 0.9804 1.0060 1.0343
## SaO2_max 0.9361 1.0682 0.8964 0.9777
## SysABP_diff 1.0073 0.9928 1.0032 1.0114
## Temp_max 0.7507 1.3320 0.6565 0.8586
## GCS_max 0.9092 1.0998 0.8838 0.9354
## Urine_max 0.9957 1.0043 0.9935 0.9979
## Urine_diff 1.0040 0.9960 1.0017 1.0063
## Lactate_diff 1.0777 0.9279 1.0302 1.1275
## HCT_max 0.9709 1.0300 0.9507 0.9915
## Creatinine_diff 1.0408 0.9608 0.9827 1.1023
## Na_diff 1.0652 0.9388 1.0420 1.0889
## TroponinI_max 0.9910 1.0091 0.9822 0.9999
##
## Concordance= 0.745 (se = 0.011 )
## Likelihood ratio test= 359.7 on 15 df, p=<0.0000000000000002
## Wald test = 384 on 15 df, p=<0.0000000000000002
## Score (logrank) test = 411.7 on 15 df, p=<0.0000000000000002
# According to the summary of coxmod_uni, use only significant vars.
coxmod_mv_uni_redc <- coxph(surv_object_c ~ Age + HR_diff + RespRate_diff + SysABP_diff + Temp_max + GCS_max + GCS_diff + Urine_max + Urine_diff + Lactate_diff + Creatinine_diff + Na_min + Na_diff, data=icu_sub2c)
sum_uni_redc <- summary(coxmod_mv_uni_redc)
sum_uni_redc
## Call:
## coxph(formula = surv_object_c ~ Age + HR_diff + RespRate_diff +
## SysABP_diff + Temp_max + GCS_max + GCS_diff + Urine_max +
## Urine_diff + Lactate_diff + Creatinine_diff + Na_min + Na_diff,
## data = icu_sub2c)
##
## n= 1346, number of events= 480
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Age 0.031233 1.031725 0.003439 9.082 < 0.0000000000000002 ***
## HR_diff 0.006593 1.006614 0.002365 2.787 0.005319 **
## RespRate_diff 0.019673 1.019867 0.006852 2.871 0.004088 **
## SysABP_diff 0.006408 1.006429 0.002084 3.076 0.002101 **
## Temp_max -0.235642 0.790064 0.064731 -3.640 0.000272 ***
## GCS_max -0.110601 0.895296 0.014961 -7.393 0.000000000000144 ***
## GCS_diff -0.059815 0.941939 0.021879 -2.734 0.006259 **
## Urine_max -0.004284 0.995725 0.001123 -3.814 0.000137 ***
## Urine_diff 0.004020 1.004028 0.001166 3.448 0.000564 ***
## Lactate_diff 0.092448 1.096856 0.023704 3.900 0.000096129953422 ***
## Creatinine_diff 0.036649 1.037329 0.028983 1.265 0.206049
## Na_min -0.019679 0.980514 0.009216 -2.135 0.032736 *
## Na_diff 0.057274 1.058946 0.011349 5.047 0.000000449708978 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Age 1.0317 0.9693 1.0248 1.0387
## HR_diff 1.0066 0.9934 1.0020 1.0113
## RespRate_diff 1.0199 0.9805 1.0063 1.0337
## SysABP_diff 1.0064 0.9936 1.0023 1.0105
## Temp_max 0.7901 1.2657 0.6959 0.8969
## GCS_max 0.8953 1.1169 0.8694 0.9219
## GCS_diff 0.9419 1.0616 0.9024 0.9832
## Urine_max 0.9957 1.0043 0.9935 0.9979
## Urine_diff 1.0040 0.9960 1.0017 1.0063
## Lactate_diff 1.0969 0.9117 1.0471 1.1490
## Creatinine_diff 1.0373 0.9640 0.9800 1.0980
## Na_min 0.9805 1.0199 0.9630 0.9984
## Na_diff 1.0589 0.9443 1.0357 1.0828
##
## Concordance= 0.741 (se = 0.011 )
## Likelihood ratio test= 350.1 on 13 df, p=<0.0000000000000002
## Wald test = 364.3 on 13 df, p=<0.0000000000000002
## Score (logrank) test = 395.4 on 13 df, p=<0.0000000000000002
anova(coxmod_mv_all, coxmod_mv_uni, coxmod_mv_redc, coxmod_mv_uni_redc)
## Analysis of Deviance Table
## Cox model: response is surv_object_c
## Model 1: ~ Age + HR_max + HR_min + HR_diff + RespRate_max + RespRate_min + RespRate_diff + SaO2_max + SaO2_min + SaO2_diff + SysABP_max + SysABP_min + SysABP_diff + Temp_max + Temp_min + Temp_diff + GCS_max + GCS_min + GCS_diff + Urine_max + Urine_min + Urine_diff + Lactate_max + Lactate_min + Lactate_diff + HCT_max + HCT_min + HCT_diff + Creatinine_max + Creatinine_min + Creatinine_diff + Na_max + Na_min + Na_diff + K_max + K_min + K_diff + TroponinI_max + TroponinI_min + TroponinI_diff
## Model 2: ~ Age + HR_diff + RespRate_max + RespRate_min + RespRate_diff + SysABP_max + SysABP_diff + Temp_max + Temp_min + Temp_diff + GCS_max + GCS_diff + Urine_max + Urine_min + Urine_diff + Lactate_max + Lactate_min + Lactate_diff + HCT_max + HCT_diff + Creatinine_max + Creatinine_min + Creatinine_diff + Na_min + Na_diff + K_max + K_diff
## Model 3: ~ Age + HR_min + HR_diff + RespRate_diff + SaO2_max + SysABP_diff + Temp_max + GCS_max + Urine_max + Urine_diff + Lactate_diff + HCT_max + Creatinine_diff + Na_diff + TroponinI_max
## Model 4: ~ Age + HR_diff + RespRate_diff + SysABP_diff + Temp_max + GCS_max + GCS_diff + Urine_max + Urine_diff + Lactate_diff + Creatinine_diff + Na_min + Na_diff
## loglik Chisq Df P(>|Chi|)
## 1 -3157.5
## 2 -3170.7 26.3247 13 0.015372 *
## 3 -3180.6 19.8185 12 0.070598 .
## 4 -3185.4 9.6746 2 0.007928 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Unsure if this means I should prefer Model 1 over the others?
Model 2 is preferred, coxmod_mv_uni.
mod1_AIC <- 2*(sum_all$logtest[["df"]])-2*sum_all$loglik[2]
mod2_AIC <- 2*(sum_uni$logtest[["df"]])-2*sum_uni$loglik[2]
mod3_AIC <- 2*(sum_redc$logtest[["df"]])-2*sum_redc$loglik[2]
mod4_AIC <- 2*(sum_uni_redc$logtest[["df"]])-2*sum_uni_redc$loglik[2]
# print AIC of each model
print(paste("AIC of coxmod_mv_all : ", mod1_AIC))
## [1] "AIC of coxmod_mv_all : 6395.07849677052"
print(paste("AIC of coxmod_mv_uni : ", mod2_AIC))
## [1] "AIC of coxmod_mv_uni : 6395.4032442264"
print(paste("AIC of coxmod_mv_redc : ", mod3_AIC))
## [1] "AIC of coxmod_mv_redc : 6391.22174133117"
print(paste("AIC of coxmod_mv_uni_redc : ", mod4_AIC))
## [1] "AIC of coxmod_mv_uni_redc : 6396.89637940166"
propmv2 <- cox.zph(coxmod_mv_uni)
propmv2
## chisq df p
## Age 4.5061 1 0.03377
## HR_diff 2.0669 1 0.15053
## RespRate_max 1.5695 1 0.21029
## RespRate_min 2.3830 1 0.12266
## RespRate_diff 0.4458 1 0.50436
## SysABP_max 3.0414 1 0.08117
## SysABP_diff 0.9931 1 0.31898
## Temp_max 0.9203 1 0.33741
## Temp_min 1.6280 1 0.20198
## Temp_diff 2.9932 1 0.08361
## GCS_max 21.4656 1 0.0000036
## GCS_diff 3.7476 1 0.05288
## Urine_max 3.6341 1 0.05661
## Urine_min 0.7971 1 0.37195
## Urine_diff 2.9563 1 0.08554
## Lactate_max 7.7776 1 0.00529
## Lactate_min 3.1142 1 0.07761
## Lactate_diff 5.2971 1 0.02136
## HCT_max 4.3411 1 0.03720
## HCT_diff 4.7201 1 0.02981
## Creatinine_max 1.4767 1 0.22429
## Creatinine_min 1.4262 1 0.23239
## Creatinine_diff 0.3104 1 0.57746
## Na_min 0.0312 1 0.85988
## Na_diff 0.5929 1 0.44131
## K_max 1.0045 1 0.31621
## K_diff 0.0907 1 0.76334
## GLOBAL 60.0869 27 0.00026
Select an initial subset of explanatory variables that you will use to predict survival. Justify your choice.
Conduct basic exploratory data analysis on your variables of choice.
Fit appropriate univariate Cox proportional hazards models.
Fit an appropriate series of multivariable Cox proportional hazards models, justifying your approach. Assess each model you consider for goodness of fit and other relevant statistics.
Present your final model. Your final model should not include all the predictor variables, just a small subset of them, which you have selected based on statistical significance and/or background knowledge.
For your final model, present a set of diagnostic statistics and/or charts and comment on them.
Write a paragraph or two summarising the most important findings of your final model. Include the most important values from the statistical output, and a simple clinical interpretation.
Create your response to this task here, as a mixture of embedded (knitr) R code and any resulting outputs, and explanatory or commentary text.
Reminder: don’t forget to save this file, to knit it to check that everything works, and then submit via the drop box in OpenLearning.
When you have finished, and are satisfied with your assessment solutions, and this file knits without errors and the output looks the way you want, then you should submit via the drop box in OpenLearning.
If you encounter problems with any part of the process described above, please contact the course convenor via OpenLearning as soon as possible so that the issues can be resolved in good time, and well before the assessment is due.
The instructions are deliberately open-ended and less prescriptive than the individual assessments to allow you some latitude in what you do and how you go about the task. However, to complete the tasks and gain full marks, you only need to replicate or repeat the steps covered in the course - if you do most or all of the things described in the revalant chapters of the HDAT9600 course, full marks will be awarded.
Note also that with respect to the model fitting, there are no right or wrong answers when it comes to variable selection and other aspects of model specification. Deep understanding of the underlying medical concepts which govern patient treatment and outcomes in ICUs is not required or assumed, although you should try to gain some understanding of each variable using the links provided. You will not be marked down if your medical justifications are not exactly correct or complete, but do you best, and don’t hesitate to seek help from the course convenor.